The InterPro protein families and domains database: 20 years on

Matthias Blum (European Bioinformatics Institute), Hsin-Yu Chang (European Bioinformatics Institute), Sara Chuguransky (European Bioinformatics Institute), Tiago Grego (European Bioinformatics Institute), Swaathi Kandasaamy (European Bioinformatics Institute), Alex Mitchell (European Bioinformatics Institute), Gift Nuka (European Bioinformatics Institute), Typhaine Paysan‐Lafosse (European Bioinformatics Institute), Matloob Qureshi (European Bioinformatics Institute), Shriya Raj (European Bioinformatics Institute), Lorna Richardson (European Bioinformatics Institute), Gustavo A Salazar (European Bioinformatics Institute), Lowri Williams (European Bioinformatics Institute), Peer Bork (European Molecular Biology Laboratory), Alan Bridge (SIB Swiss Institute of Bioinformatics), Julian Gough (MRC Laboratory of Molecular Biology), Daniel H. Haft (National Institutes of Health), Ivica Letunić (Biobyte Solutions (Germany)), Aron Marchler‐Bauer (National Institutes of Health), Huaiyu Mi (University of Southern California), Darren A. Natale (Georgetown University), Marco Necci (University of Padua), Christine Orengo (Institute of Structural and Molecular Biology), Arun Prasad Pandurangan (MRC Laboratory of Molecular Biology), Catherine Rivoire (SIB Swiss Institute of Bioinformatics), Christian J A Sigrist (SIB Swiss Institute of Bioinformatics), Ian Sillitoe (Institute of Structural and Molecular Biology), Narmada Thanki (National Institutes of Health), Paul D. Thomas (University of Southern California), Silvio C. E. Tosatto (University of Padua), Cathy Wu (Georgetown University), Alex Bateman (European Bioinformatics Institute), Robert Finn (European Bioinformatics Institute)
Nucleic Acids Research
October 23, 2020
Cited by 2,368 | Open Access

Abstract

The InterPro database (https://www.ebi.ac.uk/interpro/) provides an integrative classification of protein sequences into families, and identifies functionally important domains and conserved sites. InterProScan is the underlying software that allows protein and nucleic acid sequences to be searched against InterPro's signatures. Signatures are predictive models which describe protein families, domains or sites, and are provided by multiple databases. InterPro combines signatures representing equivalent families, domains or sites, and provides additional information such as descriptions, literature references and Gene Ontology (GO) terms, to produce a comprehensive resource for protein classification. Founded in 1999, InterPro has become one of the most widely used resources for protein family annotation. Here, we report the status of InterPro (version 81.0) in its 20th year of operation, and its associated software, including updates to database content, the release of a new website and REST API, and performance improvements in InterProScan.
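The integration step described above, in which signatures from several member databases that represent the same family, domain or site are combined into one entry carrying a description and GO terms, can be pictured with a small data-structure sketch. This is only an illustration under stated assumptions: the class names, the `integrate` function, the accessions (`SIG_A1`, `ENTRY_1`, and so on) and the database names are all hypothetical and are not InterPro's actual schema, code or data.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Signature:
    accession: str  # hypothetical member-database accession
    database: str   # hypothetical name of the contributing database


@dataclass
class Entry:
    accession: str
    description: str
    signatures: list = field(default_factory=list)
    go_terms: list = field(default_factory=list)


def integrate(signatures, assignment, descriptions, go_terms):
    """Group signatures judged equivalent into a single annotated entry.

    `assignment` maps a signature accession to an entry accession; in this
    sketch it stands in for the decision that signatures are equivalent.
    """
    grouped = defaultdict(list)
    for sig in signatures:
        entry_acc = assignment.get(sig.accession)
        if entry_acc is not None:  # unassigned signatures stay unintegrated
            grouped[entry_acc].append(sig)
    return {
        acc: Entry(acc, descriptions.get(acc, ""), sigs, go_terms.get(acc, []))
        for acc, sigs in grouped.items()
    }


# Illustrative inputs only; none of these identifiers come from the paper.
signatures = [
    Signature("SIG_A1", "DatabaseA"),
    Signature("SIG_B7", "DatabaseB"),
    Signature("SIG_A2", "DatabaseA"),  # no assignment, so not integrated
]
assignment = {"SIG_A1": "ENTRY_1", "SIG_B7": "ENTRY_1"}
descriptions = {"ENTRY_1": "Example domain"}
go_terms = {"ENTRY_1": ["GO:0005515"]}  # illustrative GO assignment

entries = integrate(signatures, assignment, descriptions, go_terms)
for entry in entries.values():
    members = ", ".join(f"{s.database}:{s.accession}" for s in entry.signatures)
    print(entry.accession, entry.description, members, entry.go_terms)
```

Keeping the signature-to-entry assignment as a plain mapping separates the decision that two signatures are equivalent from the bookkeeping that assembles the annotated entry.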
