Why rankings of biomedical image analysis competitions should be interpreted with care

Lena Maier‐Hein; Matthias Eisenmann; Annika Reinke; Sinan Onogur; Marko Stankovic; Patrick Godau; Tal Arbel; Hrvoje Bogunović; Andrew P. Bradley; Aaron Carass; Carolin Feldmann; Alejandro F. Frangi; Peter M. Full; Bram van Ginneken; Allan Hanbury; Katrin Honauer; Michal Kozubek; Bennett A. Landman; Keno März; Oskar Maier; Klaus Maier‐Hein; Bjoern Menze; Henning Müller; Peter Neher; Wiro J. Niessen; Nasir Rajpoot; G Sharp; Korsuk Sirinukunwattana; Stefanie Speidel; Christian Stock; Danail Stoyanov; Abdel Aziz Taha; Fons van der Sommen; Ching‐Wei Wang; Marc-André Weber; Guoyan Zheng; Pierre Jannin; Annette Kopp‐Schneider

doi:10.1038/s41467-018-07619-7

Lena Maier‐Hein (German Cancer Research Center), Matthias Eisenmann (German Cancer Research Center), Annika Reinke (German Cancer Research Center), Sinan Onogur (German Cancer Research Center), Marko Stankovic (German Cancer Research Center), Patrick Godau (German Cancer Research Center), Tal Arbel (McGill University), Hrvoje Bogunović (Christian Doppler Laboratory for Thermoelectricity), Andrew P. Bradley (Queensland University of Technology), Aaron Carass (Johns Hopkins University), Carolin Feldmann (German Cancer Research Center), Alejandro F. Frangi (University of Leeds), Peter M. Full (German Cancer Research Center), Bram van Ginneken (Radboud University Nijmegen), Allan Hanbury (TU Wien), Katrin Honauer (Heidelberg University), Michal Kozubek (Masaryk University), Bennett A. Landman (Vanderbilt University), Keno März (German Cancer Research Center), Oskar Maier (University of Lübeck), Klaus Maier‐Hein (German Cancer Research Center), Bjoern Menze (Technical University of Munich), Henning Müller (HES-SO University of Applied Sciences and Arts Western Switzerland), Peter Neher (German Cancer Research Center), Wiro J. Niessen (Erasmus MC), Nasir Rajpoot (University of Warwick), G Sharp (Massachusetts General Hospital), Korsuk Sirinukunwattana (University of Oxford), Stefanie Speidel (National Center for Tumor Diseases), Christian Stock (German Cancer Research Center), Danail Stoyanov (University College London), Abdel Aziz Taha (Research Studios Austria), Fons van der Sommen (Eindhoven University of Technology), Ching‐Wei Wang (National Taiwan University of Science and Technology), Marc-André Weber (University of Rostock), Guoyan Zheng (University of Bern), Pierre Jannin (Inserm), Annette Kopp‐Schneider (German Cancer Research Center)

Nature Communications

November 30, 2018

Cited by 362 | Open Access

Abstract

International challenges have become the standard for validation of biomedical image analysis methods. Given their scientific impact, it is surprising that a critical analysis of common practices related to the organization of challenges has not yet been performed. In this paper, we present a comprehensive analysis of biomedical image analysis challenges conducted up to now. We demonstrate the importance of challenges and show that the lack of quality control has critical consequences. First, reproducibility and interpretation of the results are often hampered as only a fraction of relevant information is typically provided. Second, the rank of an algorithm is generally not robust to a number of variables such as the test data used for validation, the ranking scheme applied and the observers that make the reference annotations. To overcome these problems, we recommend best practice guidelines and define open research questions to be addressed in the future.
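
As a rough illustration of the abstract's second point, the sketch below shows how the same results can yield different rankings under two ranking schemes. Everything in it is an assumption made for illustration: the three algorithms, their per-case scores, and the function names `aggregate_then_rank`, `rank_then_aggregate`, and `kendall_tau` are hypothetical and are not taken from the paper. Kendall's tau appears only as one convenient way to quantify how much two rankings disagree.

```python
# Hypothetical per-case scores (higher is better); NOT data from the paper.
scores = {
    "A": [0.90, 0.90, 0.90, 0.10],
    "B": [0.80, 0.80, 0.80, 0.80],
    "C": [0.85, 0.85, 0.85, 0.05],
}


def aggregate_then_rank(scores):
    """Average each algorithm's scores, then order by mean (best first)."""
    means = {name: sum(vals) / len(vals) for name, vals in scores.items()}
    return sorted(means, key=means.get, reverse=True)


def rank_then_aggregate(scores):
    """Rank algorithms on every test case, then order by mean rank (best first).

    Ties are not handled; the toy data above contains none.
    """
    names = list(scores)
    n_cases = len(next(iter(scores.values())))
    rank_sums = {name: 0 for name in names}
    for case in range(n_cases):
        ordered = sorted(names, key=lambda n: scores[n][case], reverse=True)
        for rank, name in enumerate(ordered, start=1):
            rank_sums[name] += rank
    return sorted(names, key=lambda n: rank_sums[n] / n_cases)


def kendall_tau(order_a, order_b):
    """Kendall's tau between two tie-free rankings of the same items."""
    pos_a = {name: i for i, name in enumerate(order_a)}
    pos_b = {name: i for i, name in enumerate(order_b)}
    names = list(order_a)
    concordant = discordant = 0
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            x, y = names[i], names[j]
            # A pair is concordant if both rankings order it the same way.
            if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) > 0:
                concordant += 1
            else:
                discordant += 1
    return (concordant - discordant) / (concordant + discordant)


by_mean = aggregate_then_rank(scores)
by_rank = rank_then_aggregate(scores)
tau = kendall_tau(by_mean, by_rank)

print("aggregate-then-rank:", by_mean)
print("rank-then-aggregate:", by_rank)
print("Kendall's tau:", round(tau, 3))
```

In this toy setup, algorithm B wins when scores are averaged first because it never fails badly, while algorithm A wins when cases are ranked first because it is best on most individual cases. Which behavior is preferable depends on the application, which is one reason a single published ranking deserves careful reading.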

