Current status and new features of the Consensus Coding Sequence database

Catherine M. Farrell (University of California, Santa Cruz), Nuala A. O’Leary (Howard Hughes Medical Institute), Rachel Harte (Howard Hughes Medical Institute), Jane Loveland (Howard Hughes Medical Institute), Laurens Wilming (Howard Hughes Medical Institute), Craig Wallin (Howard Hughes Medical Institute), Mark Diekhans (Howard Hughes Medical Institute), Daniel Barrell (Howard Hughes Medical Institute), Stephen M. J. Searle (Howard Hughes Medical Institute), Bronwen Aken (Howard Hughes Medical Institute), Susan M. Hiatt (Howard Hughes Medical Institute), Adam Frankish (Howard Hughes Medical Institute), Marie‐Marthe Suner (Howard Hughes Medical Institute), Bhanu Rajput (Howard Hughes Medical Institute), Charles A. Steward (Howard Hughes Medical Institute), Garth Brown (Howard Hughes Medical Institute), Ruth Bennett (Howard Hughes Medical Institute), Michael R. Murphy (Howard Hughes Medical Institute), Wendy Wu (Howard Hughes Medical Institute), Mike Kay (Howard Hughes Medical Institute), Jennifer Hart (Howard Hughes Medical Institute), Jeena Rajan (Howard Hughes Medical Institute), Janet A. Weber (Howard Hughes Medical Institute), Catherine Snow (Howard Hughes Medical Institute), Lillian D. Riddick (Howard Hughes Medical Institute), Toby Hunt (Howard Hughes Medical Institute), David Webb (Howard Hughes Medical Institute), Mark Thomas (Howard Hughes Medical Institute), Pamela Tamez (Howard Hughes Medical Institute), Sanjida H Rangwala (Howard Hughes Medical Institute), Kelly M. McGarvey (Howard Hughes Medical Institute), Shashikant Pujar (Howard Hughes Medical Institute), Andrei Shkeda (Howard Hughes Medical Institute), Jonathan M. Mudge (Howard Hughes Medical Institute), José M. González (Howard Hughes Medical Institute), James Gilbert (Howard Hughes Medical Institute), Stephen J. Trevanion (Howard Hughes Medical Institute), Robert Baertsch (Howard Hughes Medical Institute), Jennifer Harrow (Howard Hughes Medical Institute), Tim Hubbard (Howard Hughes Medical Institute), James M. Ostell (Howard Hughes Medical Institute), David Haussler (Howard Hughes Medical Institute), Kim D. Pruitt (University of California, Santa Cruz)
Nucleic Acids Research
November 11, 2013
Cited by 159
Open Access

Abstract

The Consensus Coding Sequence (CCDS) project (http://www.ncbi.nlm.nih.gov/CCDS/) is a collaborative effort to maintain a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assemblies by the National Center for Biotechnology Information (NCBI) and Ensembl genome annotation pipelines. Identical annotations that pass quality assurance tests are tracked with a stable identifier (CCDS ID). Members of the collaboration, who are from NCBI, the Wellcome Trust Sanger Institute and the University of California Santa Cruz, provide coordinated and continuous review of the dataset to ensure high-quality CCDS representations. We describe here the current status and recent growth in the CCDS dataset, as well as recent changes to the CCDS web and FTP sites. These changes include more explicit reporting about the NCBI and Ensembl annotation releases being compared, new search and display options, the addition of biologically descriptive information and our approach to representing genes for which support evidence is incomplete. We also present a summary of recent and future curation targets.

