A joint NCBI and EMBL-EBI transcript set for clinical genomics and research

Joannella Morales; Shashikant Pujar; Jane Loveland; Alex Astashyn; Ruth Bennett; Andrew Berry; Eric Cox; Claire Davidson; Olga Ermolaeva; Catherine M. Farrell; Reham Fatima; Laurent Gil; Tamara Goldfarb; José M. González; Diana Haddad; Matthew P. Hardy; Toby Hunt; John D. Jackson; Vinita Joardar; Mike Kay; Vamsi K. Kodali; Kelly M. McGarvey; Aoife McMahon; Jonathan M. Mudge; Daniel N. Murphy; Michael R. Murphy; Bhanu Rajput; Sanjida H Rangwala; Lillian D. Riddick; Françoise Thibaud‐Nissen; Glen Threadgold; Anjana R. Vatsan; Craig Wallin; David Webb; Paul Flicek; Ewan Birney; Kim D. Pruitt; Adam Frankish; Fiona Cunningham; Terence D. Murphy

doi:10.1038/s41586-022-04558-8

Joannella Morales (European Bioinformatics Institute), Shashikant Pujar (National Institutes of Health), Jane Loveland (European Bioinformatics Institute), Alex Astashyn (National Institutes of Health), Ruth Bennett (European Bioinformatics Institute), Andrew Berry (European Bioinformatics Institute), Eric Cox (National Institutes of Health), Claire Davidson (European Bioinformatics Institute), Olga Ermolaeva (National Institutes of Health), Catherine M. Farrell (National Institutes of Health), Reham Fatima (European Bioinformatics Institute), Laurent Gil (European Bioinformatics Institute), Tamara Goldfarb (National Institutes of Health), José M. González (European Bioinformatics Institute), Diana Haddad (National Institutes of Health), Matthew P. Hardy (European Bioinformatics Institute), Toby Hunt (European Bioinformatics Institute), John D. Jackson (National Institutes of Health), Vinita Joardar (National Institutes of Health), Mike Kay (European Bioinformatics Institute), Vamsi K. Kodali (National Institutes of Health), Kelly M. McGarvey (National Institutes of Health), Aoife McMahon (European Bioinformatics Institute), Jonathan M. Mudge (European Bioinformatics Institute), Daniel N. Murphy (European Bioinformatics Institute), Michael R. Murphy (National Institutes of Health), Bhanu Rajput (National Institutes of Health), Sanjida H Rangwala (National Institutes of Health), Lillian D. Riddick (National Institutes of Health), Françoise Thibaud‐Nissen (National Institutes of Health), Glen Threadgold (European Bioinformatics Institute), Anjana R. Vatsan (National Institutes of Health), Craig Wallin (National Institutes of Health), David Webb (National Institutes of Health), Paul Flicek (European Bioinformatics Institute), Ewan Birney (European Bioinformatics Institute), Kim D. Pruitt (National Institutes of Health), Adam Frankish (European Bioinformatics Institute), Fiona Cunningham (European Bioinformatics Institute), Terence D. Murphy (National Institutes of Health)

Nature

April 6, 2022

Cited by 582; Open Access

Abstract

Comprehensive genome annotation is essential to understand the impact of clinically relevant variants. However, the absence of a standard for clinical reporting and browser display complicates the process of consistent interpretation and reporting. To address these challenges, Ensembl/GENCODE (ref. 1) and RefSeq (ref. 2) launched a joint initiative, the Matched Annotation from NCBI and EMBL-EBI (MANE) collaboration, to converge on human gene and transcript annotation and to jointly define a high-value set of transcripts and corresponding proteins. Here, we describe the MANE transcript sets for use as universal standards for variant reporting and browser display. The MANE Select set identifies a representative transcript for each human protein-coding gene, whereas the MANE Plus Clinical set provides additional transcripts at loci where the Select transcripts alone are not sufficient to report all currently known clinical variants. Each MANE transcript represents an exact match between the exonic sequences of an Ensembl/GENCODE transcript and its counterpart in RefSeq such that the identifiers can be used synonymously. We have now released MANE Select transcripts for 97% of human protein-coding genes, including all American College of Medical Genetics and Genomics Secondary Findings list v3.0 (ref. 3) genes. MANE transcripts are accessible from major genome browsers and key resources. Widespread adoption of these transcript sets will increase the consistency of reporting, facilitate the exchange of data regardless of the annotation source and help to streamline clinical interpretation.
