Overview of BioCreative II gene mention recognition

Larry Smith (National Center for Biotechnology Information), Lorraine Tanabe (National Center for Biotechnology Information), R. Ando, Cheng-Ju Kuo (National Yang Ming Chiao Tung University), I‐Fang Chung (National Yang Ming Chiao Tung University), Chun‐Nan Hsu (Institute of Information Science, Academia Sinica), Yu-Shi Lin (Institute of Information Science, Academia Sinica), Roman Klinger (Fraunhofer Institute for Algorithms and Scientific Computing), Christoph M. Friedrich (Fraunhofer Institute for Algorithms and Scientific Computing), Kuzman Ganchev (University of Pennsylvania), Manabu Torii (Georgetown University), Hongfang Liu (Georgetown University), Barry Haddow (University of Edinburgh), Craig A. Struble (Marquette University), Richard J. Povinelli (Marquette University), Andreas Vlachos (University of Cambridge), William A. Baumgartner (University of Colorado Denver), Lawrence Hunter (University of Colorado Denver), Bob Carpenter, Richard Tzong‐Han Tsai (Institute of Information Science, Academia Sinica), Hong-Jie Dai (National Tsing Hua University), Feng Liu (Vrije Universiteit Brussel), Yifei Chen (Vrije Universiteit Brussel), Chengjie Sun (Harbin Institute of Technology), Sophia Katrenko (University of Amsterdam), Pieter Adriaans (University of Amsterdam), Christian Blaschke, Rafael Torres, Mariana Neves (Universidad Complutense de Madrid), Preslav Nakov (Institute for Parallel Processing), Anna Divoli (University of California, Berkeley), Manuel Jesús Maña López (Universidad de Huelva), Jacinto Mata Vázquez (Universidad de Huelva), W. John Wilbur (National Center for Biotechnology Information)
Genome Biology
September 1, 2008

Abstract

Nineteen teams presented results for the Gene Mention Task at the BioCreative II Workshop. In this task, participants designed systems to identify substrings in sentences that correspond to gene name mentions. The teams used a variety of methods, and results varied, with a highest F1 score of 0.8721. Here we present brief descriptions of all the methods used and a statistical analysis of the results. We also demonstrate that combining the results from all submissions makes an F1 score of 0.9066 feasible, and furthermore that the best combined result makes use of the lowest-scoring submissions.
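
As a rough illustration of the metric reported above, the following Python sketch computes precision, recall, and F1 for predicted gene mention spans against gold-standard spans. It assumes exact span matching and a hypothetical `(sentence_id, start, end)` span representation; the official evaluation may score boundaries differently, and the example data are invented for illustration only.

```python
from typing import Set, Tuple

# Hypothetical span representation: (sentence_id, start_offset, end_offset).
Span = Tuple[str, int, int]


def precision_recall_f1(gold: Set[Span], predicted: Set[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under exact span matching (an assumption)."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Invented toy data: three gold mentions, four predictions, two exact matches.
gold = {("s1", 0, 4), ("s1", 10, 15), ("s2", 3, 8)}
predicted = {("s1", 0, 4), ("s2", 3, 8), ("s2", 20, 25), ("s3", 1, 2)}

p, r, f = precision_recall_f1(gold, predicted)
print(f"precision={p:.4f} recall={r:.4f} F1={f:.4f}")
```

Because F1 is a harmonic mean, a system cannot reach a high score by maximizing only precision or only recall.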

