NCBI Reference Sequences: current status, policy and new initiatives

Kim D. Pruitt; Tatiana Tatusova; William Klimke; Donna Maglott

doi:10.1093/nar/gkn721

Kim D. Pruitt (National Institutes of Health), Tatiana Tatusova (National Institutes of Health), William Klimke (National Institutes of Health), Donna Maglott (National Institutes of Health)

Nucleic Acids Research

October 17, 2008

Open Access

Abstract

NCBI's Reference Sequence (RefSeq) database (http://www.ncbi.nlm.nih.gov/RefSeq/) is a curated non-redundant collection of sequences representing genomes, transcripts and proteins. RefSeq records integrate information from multiple sources and represent a current description of the sequence, the gene and sequence features. The database includes over 5300 organisms spanning prokaryotes, eukaryotes and viruses, with records for more than 5.5 × 10^6 proteins (RefSeq release 30). Feature annotation is applied by a combination of curation, collaboration, propagation from other sources and computation. We report here on the recent growth of the database, recent changes to feature annotations and record types for eukaryotic (primarily vertebrate) species and policies regarding species inclusion and genome annotation. In addition, we introduce RefSeqGene, a new initiative to support reporting variation data on a stable genomic coordinate system.
