The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens

Naihui Zhou(Iowa State University), Yuxiang Jiang(Indiana University Bloomington), Timothy Bergquist(University of Washington Medical Center), Alexandra Lee(Translational Therapeutics (United States)), Balint Z Kacsoh(Dartmouth College), Alex W. Crocker(Dartmouth College), Kimberley A. Lewis(Dartmouth College), George P. Georghiou(European Bioinformatics Institute), Huy Nguyen(Iowa State University), Md-Nafiz Hamid(Iowa State University), L. Taylor Davis, Tunca Doğan(European Bioinformatics Institute), Volkan Atalay(Middle East Technical University), Ahmet Süreyya Rifaioğlu(Middle East Technical University), Alperen Dalkıran(Middle East Technical University), Rengül Çetin-Atalay(Middle East Technical University), Chengxin Zhang(University of Michigan), Rebecca L. Hurto(University of Michigan), Peter L. Freddolino(University of Michigan), Yang Zhang(University of Michigan), Prajwal Bhat, Fran Supek(Institució Catalana de Recerca i Estudis Avançats), José M. Fernández(Barcelona Supercomputing Center), Branislava Gemović(University of Belgrade), Vladimir Perović(University of Belgrade), Radoslav Davidović(University of Belgrade), Neven Šumonja(University of Belgrade), Nevena Veljković(University of Belgrade), Ehsaneddin Asgari(University of California, Berkeley), Mohammad R. K. Mofrad(La Jolla Bioengineering Institute), Giuseppe Profiti(Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies), Castrense Savojardo(University of Bologna), Pier Luigi Martelli(University of Bologna), Rita Casadio(University of Bologna), Florian Boecker(University of Bonn), Heiko Schoof(University of Bonn), Indika Kahanda(Montana State University), Natalie Thurlby(University of Bristol), Alice C. McHardy(Helmholtz Centre for Infection Research), Alexandre Renaux(Université Libre de Bruxelles), Rabie Saidi(European Bioinformatics Institute), Julian Gough(MRC Laboratory of Molecular Biology), Alex A. Freitas(University of Kent), Magdalena Antczak(University of Kent), Fábio Fabris(University of Kent), Mark N. Wass(University of Kent), Jie Hou(University of Missouri), Jianlin Cheng(University of Missouri), Zheng Wang(University of Miami), Alfonso E. Romero(Royal Holloway University of London), Alberto Paccanaro(Royal Holloway University of London), Haixuan Yang(Ollscoil na Gaillimhe – University of Galway), Tatyana Goldberg(Technical University of Munich), Chenguang Zhao(Hattiesburg Clinic), Liisa Holm(University of Helsinki), Petri Törönen(University of Helsinki), Alan Medlar(University of Helsinki), Elaine Zosa(University of Helsinki), Itamar Borukhov(Compugen (Israel)), Ilya B. Novikov(Baylor College of Medicine), Angela D. Wilkins(Baylor College of Medicine), Olivier Lichtarge(Baylor College of Medicine), Po-Han Chi(National Tsing Hua University), Wei-Cheng Tseng(National Tsing Hua University), Michal Linial(Hebrew University of Jerusalem), Peter W. Rose(San Diego Supercomputer Center), Christophe Dessimoz(SIB Swiss Institute of Bioinformatics), Vedrana Vidulin(Jožef Stefan Institute), Sašo Džeroski(Jožef Stefan Institute), Ian Sillitoe(Institute of Structural and Molecular Biology), Sayoni Das(Institute of Structural and Molecular Biology), Jonathan Lees(Oxford Brookes University), David T. Jones(The Francis Crick Institute), Cen Wan(The Francis Crick Institute), Domenico Cozzetto(The Francis Crick Institute), Rui Fa(The Francis Crick Institute), Mateo Torres(Royal Holloway University of London), Alex Warwick Vesztrocy(SIB Swiss Institute of Bioinformatics), José Manuel Rodríguez(Spanish National Centre for Cardiovascular Research), Michael L. Tress(Spanish National Cancer Research Centre), Marco Frasca(University of Milan), Marco Notaro(University of Milan), Giuliano Grossi(University of Milan), Alessandro Petrini(University of Milan), Matteo Ré(University of Milan), Giorgio Valentini(University of Milan), Marco Mesiti(Centre National de la Recherche Scientifique), Daniel B. Roche(Technical University of Munich), Jonas Reeb(Technical University of Munich), David W. Ritchie(Centre National de la Recherche Scientifique), Sabeur Aridhi(Centre National de la Recherche Scientifique), Seyed Ziaeddin Alborzi(Centre National de la Recherche Scientifique), Marie‐Dominique Devignes(Centre National de la Recherche Scientifique), Da Chen Emily Koo(New York University), Richard Bonneau(Flatiron Health (United States)), Vladimir Gligorijević(Simons Foundation), Meet Barot(New York University), Hai Fang(Centre for Human Genetics), Stefano Toppo(University of Padua), Enrico Lavezzo(University of Padua), Marco Falda(University of Padua), Michele Berselli(University of Padua), Silvio C. E. Tosatto(University of Padua), Marco Carraro(University of Padua), Damiano Piovesan(University of Padua), Hafeez Ur Rehman(National University of Computer and Emerging Sciences), Qizhong Mao(University of California, Riverside), Shanshan Zhang(Temple University), Slobodan Vučetić(Temple University), Gage S. Black(Brigham Young University), Dane Jo(Brigham Young University), Erica Suh(Brigham Young University), Jonathan Dayton(Brigham Young University), Dallas J. Larsen(Brigham Young University), Ashton Omdahl(Brigham Young University), Liam J. McGuffin(University of Reading), Danielle A Brackenridge(University of Reading), Patricia C. Babbitt(University of California, San Francisco), Jeffrey M. Yunes(University of California, San Francisco), Paolo Fontana(Fondazione Edmund Mach), Feng Zhang(Fudan University), Shanfeng Zhu(Fudan University), Ronghui You(Fudan University), Zihan Zhang(Fudan University), Suyang Dai(Fudan University), Shuwei Yao(Fudan University), Weidong Tian(Cincinnati Children's Hospital Medical Center), Renzhi Cao(Pacific Lutheran University), Caleb Chandler(Pacific Lutheran University), Miguel Amezola(Pacific Lutheran University), D. Barrie Johnson(Pacific Lutheran University), Jia‐Ming Chang(National Chengchi University), Wen‐Hung Liao(National Chengchi University), Yiwei Liu(National Chengchi University), Stefano Pascarelli(Okinawa Institute of Science and Technology Graduate University), Yotam Frank(Tel Aviv University), Robert Hoehndorf(King Abdullah University of Science and Technology), Maxat Kulmanov(King Abdullah University of Science and Technology), Imane Boudellioua(King Abdullah University of Science and Technology), Gianfranco Politano(Politecnico di Torino), Stefano Di Carlo(Politecnico di Torino), Alfredo Benso(Politecnico di Torino), Kai Hakala(University of Turku), Filip Ginter(University of Turku), Farrokh Mehryary(University of Turku), Suwisa Kaewphan(University of Turku), Jari Björne(University of Turku), Hans Moen(University of Turku), Martti Tolvanen(University of Turku), Tapio Salakoski(University of Turku), Daisuke Kihara(University of Cincinnati), Aashish Jain(Purdue University West Lafayette), Tomislav Šmuc(Ruđer Bošković Institute), Adrian Altenhoff(SIB Swiss Institute of Bioinformatics), Asa Ben‐Hur(Colorado State University), Burkhard Rost(Bavarian State Research Center for Agriculture), Steven E. Brenner(University of California, Berkeley), Christine Orengo(Institute of Structural and Molecular Biology), Constance J. Jeffery(University of Illinois Chicago), Giovanni Bosco(Dartmouth College), Deborah A. Hogan(Dartmouth College), María Martin(European Bioinformatics Institute), Claire O’Donovan(European Bioinformatics Institute), Sean D. Mooney(University of Washington Medical Center), Casey S. Greene(Translational Therapeutics (United States)), Predrag Radivojac(Northeastern University), Iddo Friedberg(Iowa State University)
Genome Biology
November 19, 2019
Open Access

Abstract

BACKGROUND: The Critical Assessment of Functional Annotation (CAFA) is an ongoing, global, community-driven effort to evaluate and improve the computational annotation of protein function.

RESULTS: Here, we report the results of the third CAFA challenge, CAFA3, which featured an expanded analysis over the previous CAFA rounds, both in the volume of data analyzed and in the types of analysis performed. In a major new development, computational predictions and assessment goals drove some of the experimental assays, resulting in new functional annotations for more than 1000 genes. Specifically, we performed experimental whole-genome mutation screening in Candida albicans and Pseudomonas aeruginosa, which provided us with genome-wide experimental data for genes associated with biofilm formation and motility. We further performed targeted assays on selected genes in Drosophila melanogaster that we suspected of being involved in long-term memory.

CONCLUSION: We conclude that while predictions of the molecular function and biological process annotations have slightly improved over time, those of the cellular component have not. Term-centric prediction of experimental annotations remains equally challenging; although the performance of the top methods is significantly better than the expectations set by baseline methods in C. albicans and D. melanogaster, it leaves considerable room and need for improvement. Finally, we report that the CAFA community now involves a broad range of participants with expertise in bioinformatics, biological experimentation, biocuration, and bio-ontologies, working together to improve functional annotation, computational function prediction, and our ability to manage big data in the era of large experimental screens.

