The InterPro protein families database: the classification resource after 15 years

Alex Mitchell (European Bioinformatics Institute), Hsin-Yu Chang (European Bioinformatics Institute), Louise C. Daugherty (European Bioinformatics Institute), Matthew Fraser (European Bioinformatics Institute), Sarah Hunter (European Bioinformatics Institute), Rodrigo López (European Bioinformatics Institute), Craig McAnulla (European Bioinformatics Institute), Conor McMenamin (European Bioinformatics Institute), Gift Nuka (European Bioinformatics Institute), Sebastien Pesseat (European Bioinformatics Institute), Amaia Sangrador‐Vegas (European Bioinformatics Institute), Maxim Scheremetjew (European Bioinformatics Institute), Cláudia Rato (European Bioinformatics Institute), Siew-Yit Yong (European Bioinformatics Institute), Alex Bateman (European Bioinformatics Institute), Marco Punta (European Bioinformatics Institute), Teresa K. Attwood (University of Manchester), Christian J A Sigrist (SIB Swiss Institute of Bioinformatics), Nicole Redaschi (SIB Swiss Institute of Bioinformatics), Catherine Rivoire (SIB Swiss Institute of Bioinformatics), Ioannis Xénarios (SIB Swiss Institute of Bioinformatics), Daniel Kahn (Université Claude Bernard Lyon 1), Dominique Guyot (Université Claude Bernard Lyon 1), Peer Bork (European Molecular Biology Laboratory), Ivica Letunić (European Molecular Biology Laboratory), Julian Gough (University of Bristol), Matt E. Oates (University of Bristol), Daniel H. Haft (J. Craig Venter Institute), Hongzhan Huang (Georgetown University), Darren A. Natale (Georgetown University), Cathy Wu (Georgetown University), Christine Orengo (University of London), Ian Sillitoe (University of London), Huaiyu Mi (University of Southern California), Paul D. Thomas (University of Southern California), Robert Finn (European Bioinformatics Institute)
Nucleic Acids Research
November 26, 2014
Cited by 1,303
Open Access

Abstract

The InterPro database (http://www.ebi.ac.uk/interpro/) is a freely available resource that can be used to classify sequences into protein families and to predict the presence of important domains and sites. Central to the InterPro database are predictive models, known as signatures, from a range of protein family databases that have different biological focuses and use different methodological approaches to classify protein families and domains. InterPro integrates these signatures, capitalizing on the respective strengths of the individual databases, to produce a powerful protein classification resource. Here, we report on the status of InterPro as it enters its 15th year of operation, and give an overview of new developments in the database and its associated Web interfaces and software. In particular, we describe the new domain architecture search tool and outline the process of mapping Gene Ontology terms to InterPro. We also discuss the challenges faced by the resource given the explosive growth in sequence data in recent years. InterPro (version 48.0) contains 36,766 member database signatures integrated into 26,238 InterPro entries, an increase of over 3,993 entries (5,081 signatures) since 2012.

