CDD: a Conserved Domain Database for the functional annotation of proteins

Aron Marchler‐Bauer (National Center for Biotechnology Information), Shuo Lu (National Center for Biotechnology Information), J. B. Anderson (National Center for Biotechnology Information), Farideh Chitsaz (National Center for Biotechnology Information), Myra K. Derbyshire (National Center for Biotechnology Information), C. DeWeese-Scott (National Center for Biotechnology Information), Jessica H. Fong (National Center for Biotechnology Information), Lewis Y. Geer (National Center for Biotechnology Information), Renata C. Geer (National Center for Biotechnology Information), Nicole R. Gonzales (National Center for Biotechnology Information), Marya Gwadz (National Center for Biotechnology Information), David I. Hurwitz (National Institutes of Health), John David Jackson (National Center for Biotechnology Information), Zhaoxi Ke (National Institutes of Health), Christopher J. Lanczycki (National Center for Biotechnology Information), Fuping Lu (National Center for Biotechnology Information), Gabriele H. Marchler (National Center for Biotechnology Information), Michael Mullokandov (National Institutes of Health), Marina V. Omelchenko (National Institutes of Health), Courtney Robertson (National Center for Biotechnology Information), James S. Song (National Center for Biotechnology Information), Narmada Thanki (National Institutes of Health), Roxanne A. Yamashita (National Center for Biotechnology Information), Dapeng Zhang (National Center for Biotechnology Information), Nong Zhang (National Center for Biotechnology Information), Chao Zheng (National Center for Biotechnology Information), Stephen H. Bryant (National Center for Biotechnology Information)
Nucleic Acids Research
November 24, 2010
Cited by 3,233
Open Access

Abstract

NCBI's Conserved Domain Database (CDD) is a resource for the annotation of protein sequences with the location of conserved domain footprints, and functional sites inferred from these footprints. CDD includes manually curated domain models that make use of protein 3D structure to refine domain models and provide insights into sequence/structure/function relationships. Manually curated models are organized hierarchically if they describe domain families that are clearly related by common descent. As CDD also imports domain family models from a variety of external sources, it is a partially redundant collection. To simplify protein annotation, redundant models and models describing homologous families are clustered into superfamilies. By default, domain footprints are annotated with the corresponding superfamily designation, on top of which specific annotation may indicate high-confidence assignment of family membership. Pre-computed domain annotation is available for proteins in the Entrez/Protein dataset, and a novel interface, Batch CD-Search, allows the computation and download of annotation for large sets of protein queries. CDD can be accessed via http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml.
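The annotation scheme described in the abstract (superfamily label by default, with a specific family label added only for high-confidence assignments) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `Hit` record, the `MODEL_TO_SUPERFAMILY` table, the model and superfamily identifiers, and the `high_confidence` flag are hypothetical and do not reflect CDD's actual data formats, scoring, or interfaces.

```python
from dataclasses import dataclass

# Hypothetical clustering of redundant or homologous models into superfamilies.
# The identifiers are placeholders, not real CDD accessions.
MODEL_TO_SUPERFAMILY = {
    "modelA": "sf1",
    "modelB": "sf1",
    "modelC": "sf2",
}


@dataclass(frozen=True)
class Hit:
    """One domain footprint on a query protein (illustrative fields only)."""
    model: str             # domain model that matched
    start: int             # first residue of the footprint
    end: int               # last residue of the footprint
    high_confidence: bool  # assumed flag for confident family membership


def annotate(hits):
    """Label each footprint with its superfamily; add the specific
    model only when the hit is flagged as high-confidence."""
    annotations = []
    for hit in sorted(hits, key=lambda h: h.start):
        entry = {
            "range": (hit.start, hit.end),
            "superfamily": MODEL_TO_SUPERFAMILY[hit.model],
        }
        if hit.high_confidence:
            entry["specific"] = hit.model
        annotations.append(entry)
    return annotations


if __name__ == "__main__":
    example = [
        Hit("modelB", 120, 200, False),
        Hit("modelA", 10, 90, True),
    ]
    for row in annotate(example):
        print(row)
```

In this sketch both footprints receive the same superfamily label because their models fall in one cluster, while only the high-confidence hit also carries a specific family label.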

