A roadmap for the functional annotation of protein families: a community perspective

Valérie de Crécy‐Lagard (University of Florida), Rocío Amorín (University of Florida), Cecilia N. Arighi (University of Delaware), Jill Babor (University of Florida), Alex Bateman (European Bioinformatics Institute), Ian K. Blaby (Lawrence Berkeley National Laboratory), Crysten E. Blaby‐Haas (Brookhaven National Laboratory), Alan Bridge (SIB Swiss Institute of Bioinformatics), S.K. Burley (Rutgers, The State University of New Jersey), Stacey Cleveland (University of Florida), Lucy J. Colwell (University of Cambridge), Ana Conesa (Consejo Superior de Investigaciones Científicas), Christian Dallago (Technical University of Munich), Antoine Danchin (University of Hong Kong), Anita de Waard (RELX Group (United States)), Adam M. Deutschbauer (Lawrence Berkeley National Laboratory), Raquel Dias (University of Florida), Yousong Ding (University of Florida), Gang Fang (New York University Shanghai), Iddo Friedberg (Iowa State University), J.A. Gerlt (University of Illinois Urbana-Champaign), Joshua E. Goldford (Living Systems (United States)), Mark G. Gorelik (University of Florida), Benjamin M. Gyori (Harvard University), Christopher S. Henry (Argonne National Laboratory), Geoffrey Hutinet (University of Florida), Marshall Jaroch (University of Florida), Peter D. Karp (SRI International), Liudmyla Kondratova (University of Florida), Zhiyong Lu (National Institutes of Health), Aron Marchler‐Bauer (National Institutes of Health), María Martin (European Bioinformatics Institute), Claire D. McWhite (Princeton University), Gaurav D. Moghe (Cornell University), Paul Monaghan (University of Florida), Anne Morgat (SIB Swiss Institute of Bioinformatics), Chris Mungall (Lawrence Berkeley National Laboratory), Darren A. Natale (Georgetown University), William Nelson (Pacific Northwest National Laboratory), Séan O’Donoghue (UNSW Sydney), Christine Orengo (Institute of Structural and Molecular Biology), Katherine H. O’Toole (New England Biolabs (United States)), Predrag Radivojac (Northeastern University), Colbie Reed (University of Florida), Richard J. Roberts (New England Biolabs (United States)), Dmitri Rodionov (Sanford Burnham Prebys Medical Discovery Institute), Irina A. Rodionova (Sanford Burnham Prebys Medical Discovery Institute), Jeffrey D. Rudolf (University of Florida), Lana Saleh (New England Biolabs (United States)), Gloria Sheynkman (University of Virginia), Francoise Thibaud-Nissen (National Institutes of Health), Paul D. Thomas (University of Southern California), Peter Uetz (Virginia Commonwealth University), David Vallenet (Centre National de la Recherche Scientifique), Erica W. Carter (Florida Department of Citrus), Peter Weigele (New England Biolabs (United States)), Valerie Wood (University of Cambridge), Elisha M. Wood‐Charlson (Lawrence Berkeley National Laboratory), Jin Xu (Florida Department of Citrus)
Database
January 1, 2022
Open Access

Abstract

Over the last 25 years, biology has entered the genomic era and is becoming a science of 'big data'. Most interpretations of genomic analyses rely on accurate functional annotations of the proteins encoded by the more than 500 000 genomes sequenced to date. By different estimates, only half of the predicted proteins from sequenced genomes carry an accurate functional annotation, and this percentage varies drastically between organismal lineages. Such a large gap in knowledge hampers all aspects of the biological enterprise and thereby stands in the way of genomic biology reaching its full potential. A brainstorming meeting to address this issue, funded by the National Science Foundation, was held on 3-4 February 2022. Bringing together data scientists, biocurators, computational biologists and experimentalists in the same venue allowed for a comprehensive assessment of the current state of functional annotation of protein families. Further, the major issues obstructing the field were identified and discussed, which ultimately led to proposed solutions for moving forward.

