A large-scale evaluation of computational protein function prediction

Predrag Radivojac(Indiana University Bloomington), Wyatt T. Clark(Indiana University Bloomington), Tal Oron(Buck Institute for Research on Aging), Alexandra M. Schnoes(University of California, San Francisco), Tobias Wittkop(Buck Institute for Research on Aging), Artem Sokolov(University of California, Santa Cruz), Kiley Graim(Colorado State University), Christopher S. Funk(University of Colorado Denver), Karin Verspoor(Data61), Asa Ben‐Hur(Colorado State University), Gaurav Pandey(University of California, Berkeley), Jeffrey M. Yunes(Graduate Theological Union), Ameet Talwalkar(University of California, Berkeley), Susanna Repo(European Bioinformatics Institute), Michael L Souza(University of California, Berkeley), Damiano Piovesan(University of Bologna), Rita Casadio(University of Bologna), Zheng Wang(University of Missouri), Jianlin Cheng(University of Missouri), Hai Fang(University of Bristol), Julian Gough(University of Bristol), Patrik Koskinen(University of Helsinki), Petri Törönen(University of Helsinki), Jussi Nokso-Koivisto(University of Helsinki), Liisa Holm(University of Helsinki), Domenico Cozzetto(University College London), Daniel Buchan(University College London), Kevin Bryson(University College London), David T. Jones(University College London), Bhakti Limaye(Centre for Development of Advanced Computing), Harshal Inamdar(Centre for Development of Advanced Computing), Avik Datta(Centre for Development of Advanced Computing), Sunitha K Manjari(Centre for Development of Advanced Computing), Rajendra Joshi(Centre for Development of Advanced Computing), Meghana Chitale(Purdue University West Lafayette), Daisuke Kihara(Purdue University West Lafayette), Andreas Martin Lisewski(Baylor College of Medicine), Serkan Erdin(Baylor College of Medicine), Eric Venner(Baylor College of Medicine), Olivier Lichtarge(Baylor College of Medicine), Robert Rentzsch(Institute of Structural and Molecular Biology), Haixuan Yang(Royal Holloway University of London), Alfonso E. Romero(Royal Holloway University of London), Prajwal Bhat(Royal Holloway University of London), Alberto Paccanaro(Royal Holloway University of London), Tobias Hamp(Technical University of Munich), Rebecca Kaßner(Technical University of Munich), Stefan Seemayer(Technical University of Munich), Esmeralda Vicedo(Technical University of Munich), Christian Schaefer(Technical University of Munich), Dominik Achten(Technical University of Munich), Florian Auer(Technical University of Munich), Ariane C. Boehm(Technical University of Munich), Tatjana Braun(Technical University of Munich), Maximilian Hecht(Technical University of Munich), B. Mark Heron(Technical University of Munich), Peter Hönigschmid(Technical University of Munich), Thomas A. Hopf(Technical University of Munich), Stefanie Kaufmann(Technical University of Munich), Michael Kiening(Technical University of Munich), Denis Krompaß(Technical University of Munich), Cedric Landerer(Technical University of Munich), Yannick Mahlich(Technical University of Munich), Manfred Roos(Technical University of Munich), Jari Björne(University of Turku), Tapio Salakoski(University of Turku), Andrew Wong(Queen's University), Hagit Shatkay(Queen's University), Fanny Gatzmann(Max Planck Institute for Informatics), I. Sommer(Max Planck Institute for Informatics), Mark N. Wass(Spanish National Cancer Research Centre), Michael J.E. Sternberg(Imperial College London), Nives Škunca(Ruđer Bošković Institute), Fran Supek(Ruđer Bošković Institute), Matko Bošnjak(Ruđer Bošković Institute), Panče Panov(Jožef Stefan Institute), Sašo Džeroski(Jožef Stefan Institute), Tomislav Šmuc(Ruđer Bošković Institute), Yiannis Kourmpetis(SIB Swiss Institute of Bioinformatics), Aalt D. J. van Dijk(Wageningen University & Research), Cajo J. F. ter Braak(Wageningen University & Research), Yuanpeng Zhou(Fudan University), Qingtian Gong(Fudan University), Xinran Dong(Fudan University), Weidong Tian(Fudan University), Marco Falda(University of Padua), Paolo Fontana(Fondazione Edmund Mach), Enrico Lavezzo(University of Padua), Barbara Di Camillo(University of Padua), Stefano Toppo(University of Padua), Liang Lan(Temple University), Nemanja Djuric(Temple University), Yuhong Guo(Temple University), Slobodan Vučetić(Temple University), Amos Bairoch(University of Geneva), Michal Linial(Hebrew University of Jerusalem), Patricia C. Babbitt(University of California, San Francisco), Steven E. Brenner(University of California, Berkeley), Christine Orengo(Institute of Structural and Molecular Biology), Burkhard Rost(Technical University of Munich), Sean D. Mooney(Buck Institute for Research on Aging), Iddo Friedberg(Miami University)
Nature Methods
January 27, 2013
Cited by 1,090
Open Access

Abstract

Automated annotation of protein function is challenging. As the number of sequenced genomes rapidly grows, the overwhelming majority of protein products can only be annotated computationally. If computational predictions are to be relied upon, it is crucial that the accuracy of these methods be high. Here we report the results from the first large-scale community-based critical assessment of protein function annotation (CAFA) experiment. Fifty-four methods representing the state of the art for protein function prediction were evaluated on a target set of 866 proteins from 11 organisms. Two findings stand out: (i) today's best protein function prediction algorithms substantially outperform widely used first-generation methods, with large gains on all types of targets; and (ii) although the top methods perform well enough to guide experiments, there is considerable need for improvement of currently available tools.

