UniRule: a unified rule resource for automatic annotation in the UniProt Knowledgebase

Alistair MacDougall(European Bioinformatics Institute), Vladimir Volynkin(European Bioinformatics Institute), Rabie Saidi(European Bioinformatics Institute), Diego Poggioli(European Bioinformatics Institute), Hermann Zellner(European Bioinformatics Institute), Emma Hatton-Ellis(European Bioinformatics Institute), Vishal Joshi(European Bioinformatics Institute), Claire O’Donovan(European Bioinformatics Institute), Sandra Orchard(European Bioinformatics Institute), Andrea H Auchincloss(SIB Swiss Institute of Bioinformatics), Delphine Baratin(SIB Swiss Institute of Bioinformatics), Jerven Bolleman(SIB Swiss Institute of Bioinformatics), Elisabeth Coudert(SIB Swiss Institute of Bioinformatics), Edouard de Castro(SIB Swiss Institute of Bioinformatics), Chantal Hulo(SIB Swiss Institute of Bioinformatics), Patrick Masson(SIB Swiss Institute of Bioinformatics), Ivo Pedruzzi(SIB Swiss Institute of Bioinformatics), Catherine Rivoire(SIB Swiss Institute of Bioinformatics), Cecilia N. Arighi(University of Delaware), Qinghua Wang(University of Delaware), Chuming Chen(University of Delaware), Hongzhan Huang(University of Delaware), John S. Garavelli(University of Delaware), C R Vinayaka(Georgetown University), Lai-Su Yeh(Georgetown University), Darren A. Natale(Georgetown University), Kati Laiho(Georgetown University), María Martin(European Bioinformatics Institute), Alexandre Renaux(European Bioinformatics Institute), Klemens Pichler(European Bioinformatics Institute), Alex Bateman(European Bioinformatics Institute), Alan Bridge, Cathy Wu, Cecilia N. Arighi(University of Delaware), Lionel Breuza(SIB Swiss Institute of Bioinformatics), Elisabeth Coudert(SIB Swiss Institute of Bioinformatics), Hongzhan Huang(University of Delaware), Damien Lieberherr, Michele Magrane(European Bioinformatics Institute), María Martin(European Bioinformatics Institute), Peter B. McGarvey(Georgetown University), Darren A. 
Natale(European Bioinformatics Institute), Sandra Orchard(SIB Swiss Institute of Bioinformatics), Ivo Pedruzzi(SIB Swiss Institute of Bioinformatics), Sylvain Poux, Manuela Pruess, Shriya Raj, Nicole Redaschi, Lucila Aimo, Ghislaine Argoud‐Puy(SIB Swiss Institute of Bioinformatics), Andrea H Auchincloss(SIB Swiss Institute of Bioinformatics), Kristian B. Axelsen, Emmanuel Boutet, Emily Bowler-Barnett, Ramona Britto, Hema Bye‐A‐Jee, Cristina Casals‐Casas, Paul Denny, Anne Estreicher, Maria Livia Famiglietti, Marc Feuermann(University of Delaware), John S. Garavelli(University of Delaware), Penelope Garmiri, Arnaud Gos, Nadine Gruaz(European Bioinformatics Institute), Emma Hatton-Ellis(SIB Swiss Institute of Bioinformatics), Chantal Hulo(SIB Swiss Institute of Bioinformatics), Nevila Hyka‐Nouspikel, Florence Jungo(Georgetown University), Kati Laiho(Georgetown University), Philippe Le Mercier, Antonia Lock, Yvonne Lussi(European Bioinformatics Institute), Alistair MacDougall(SIB Swiss Institute of Bioinformatics), Patrick Masson(SIB Swiss Institute of Bioinformatics), Anne Morgat, Sandrine Pilbout, Lucille Pourcel(SIB Swiss Institute of Bioinformatics), Catherine Rivoire(SIB Swiss Institute of Bioinformatics), Karen Ross, Christian J A Sigrist, Elena Speretta, Shyamala Sundaram, Nidhi Tyagi(Georgetown University), C R Vinayaka(Georgetown University), Qinghua Wang(University of Delaware), Kate Warner(Georgetown University), Lai-Su Yeh(Georgetown University), Rossana Zaru, Shadab Ahmed, Emanuele Alpi, Leslie Arminski, Parit Bansal(SIB Swiss Institute of Bioinformatics), Delphine Baratin(SIB Swiss Institute of Bioinformatics), Teresa Batista Neto(SIB Swiss Institute of Bioinformatics), Jerven Bolleman(SIB Swiss Institute of Bioinformatics), Chuming Chen(University of Delaware), Yongxing Chen, Beatrice Cuche, Austra Cukura(SIB Swiss Institute of Bioinformatics), Edouard De Castro(SIB Swiss Institute of Bioinformatics), ThankGod E. 
Ebenezer(SIB Swiss Institute of Bioinformatics), Elisabeth Gasteiger, Sébastien Géhant, Leonardo Jose da Costa Gonzales, Abdulrahman Hussein(European Bioinformatics Institute), Alexandr Ignatchenko, Giuseppe Insana, Rizwan Ishtiaq(European Bioinformatics Institute), Vishal Joshi(European Bioinformatics Institute), Dushyanth Jyothi, Arnaud Kerhornou, Thierry Lombardot, Aurelian Luciani, Jie Luo, Mahdi Mahmoudy, Alok Mishra, Katie Moulang, Andrew Nightingale, Joseph Onwubiko, Monica Pozzato, Sangya Pundir(University of Delaware), Guoying Qi, Daniel L Rice(European Bioinformatics Institute), Rabie Saidi(European Bioinformatics Institute), E. B. Turner, Preethi Vasudev(University of Delaware), Yuqi Wang, Xavier Watkins(European Bioinformatics Institute), Hermann Zellner(European Bioinformatics Institute), Jian Zhang
Bioinformatics
May 5, 2020
Cited by 75
Open Access

Abstract

MOTIVATION: The number of protein records in the UniProt Knowledgebase (UniProtKB: https://www.uniprot.org) continues to grow rapidly as a result of genome sequencing and the prediction of protein-coding genes. Providing functional annotation for these proteins presents a significant and continuing challenge.

RESULTS: In response to this challenge, UniProt has developed a method of annotation, known as UniRule, based on expertly curated rules. UniRule integrates related systems (RuleBase, HAMAP, PIRSR, PIRNR) developed by the members of the UniProt consortium. It uses protein family signatures from InterPro, combined with taxonomic and other constraints, to select sets of reviewed proteins that have common functional properties supported by experimental evidence. This annotation is propagated to unreviewed records in UniProtKB that meet the same selection criteria, most of which do not have (and are unlikely ever to have) experimentally verified functional annotation. Release 2020_01 of UniProtKB contains 6496 UniRule rules, which provide annotation for 53 million proteins, accounting for 30% of the 178 million records in UniProtKB. UniRule provides scalable enrichment of annotation in UniProtKB.

AVAILABILITY AND IMPLEMENTATION: UniRule rules are integrated into UniProtKB and can be viewed at https://www.uniprot.org/unirule/. UniRule rules and the code required to run them are publicly available for researchers who wish to annotate their own sequences. The implementation used to run the rules is known as UniFIRE and is available at https://gitlab.ebi.ac.uk/uniprot-public/unifire.
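The propagation step described in the abstract (select records by family signature plus taxonomic constraint, then copy the curated annotation) can be illustrated with a minimal Python sketch. This is not UniFIRE and does not reflect the UniRule data model: the `Rule` and `Protein` classes, the `rule_matches` and `apply_rules` functions, and all identifiers such as `RULE-0001` and `SIG-A` are hypothetical placeholders chosen for illustration only.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rule:
    """Hypothetical stand-in for a curated rule (not the UniRule format)."""
    rule_id: str
    required_signatures: frozenset  # family signatures that must all be present
    taxon: str                      # taxonomic constraint on the lineage
    annotations: tuple              # annotation values to propagate


@dataclass
class Protein:
    """Hypothetical stand-in for an unreviewed protein record."""
    accession: str
    signatures: set                 # signatures matched by the sequence
    lineage: list                   # taxonomic lineage, root to species
    annotations: list = field(default_factory=list)


def rule_matches(rule: Rule, protein: Protein) -> bool:
    """True if the protein has every required signature and the rule's taxon."""
    has_signatures = rule.required_signatures <= protein.signatures
    in_taxon = rule.taxon in protein.lineage
    return has_signatures and in_taxon


def apply_rules(rules: list, proteins: list) -> list:
    """Attach each matching rule's annotations, tagged with the source rule."""
    for protein in proteins:
        for rule in rules:
            if rule_matches(rule, protein):
                for value in rule.annotations:
                    protein.annotations.append((value, rule.rule_id))
    return proteins


# Placeholder data: identifiers and annotation text are invented.
demo_rule = Rule(
    rule_id="RULE-0001",
    required_signatures=frozenset({"SIG-A"}),
    taxon="Bacteria",
    annotations=("example function annotation",),
)
demo_proteins = [
    Protein("P_MATCH", {"SIG-A", "SIG-B"}, ["Bacteria", "Example genus"]),
    Protein("P_WRONG_TAXON", {"SIG-A"}, ["Eukaryota", "Example genus"]),
    Protein("P_NO_SIGNATURE", {"SIG-B"}, ["Bacteria", "Example genus"]),
]
annotated = apply_rules([demo_rule], demo_proteins)

for protein in annotated:
    print(protein.accession, protein.annotations)
```

Tagging each propagated value with the rule identifier mirrors the general idea that automatic annotation should remain traceable to the rule that produced it. The sketch omits the "other constraints" mentioned in the abstract, since their form is not specified there.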

