Multimodal single cell data integration challenge: results and lessons learned

Christopher Lance (Helmholtz Zentrum München), Malte D. Luecken (Helmholtz Zentrum München), Daniel B. Burkhardt, Robrecht Cannoodt (Ghent University Hospital), Pia Rautenstrauch (Max Delbrück Center), Anna Laddach (The Francis Crick Institute), Aidyn Ubingazhibov (Nazarbayev University), Zhi‐Jie Cao (Peking University), Kaiwen Deng (University of Michigan), Sumeer Ahmad Khan (Kootenay Association for Science & Technology), Qiao Liu (Stanford University), Nikolay Russkikh (Novel (United States)), Gleb Ryazantsev (Novel (United States)), Uwe Ohler (Max Delbrück Center), NeurIPS 2021 Multimodal data integration competition participants (Chan Zuckerberg Biohub San Francisco), Angela Oliveira Pisco (Chan Zuckerberg Biohub San Francisco), Jonathan Bloom (Yale University), Smita Krishnaswamy (Yale University), Fabian J. Theis (Helmholtz Zentrum München)
bioRxiv (Cold Spring Harbor Laboratory)
April 12, 2022
Cited by 79. Open Access.

Abstract

Biology has become a data-intensive science. Recent technological advances in single-cell genomics have enabled the measurement of multiple facets of cellular state, producing datasets with millions of single-cell observations. While these data hold great promise for understanding molecular mechanisms in health and disease, analysis challenges arising from sparsity, technical and biological variability, and high dimensionality of the data hinder the derivation of such mechanistic insights. To promote the innovation of algorithms for analysis of multimodal single-cell data, we organized a competition at NeurIPS 2021 applying the Common Task Framework to multimodal single-cell data integration. For this competition, we generated the first multimodal benchmarking dataset for single-cell biology and defined three tasks in this domain: predicting missing modalities, aligning modalities, and learning a joint representation across modalities. We further specified evaluation metrics and developed a cloud-based algorithm evaluation pipeline. Using this setup, 280 competitors submitted over 2600 proposed solutions within a 3-month period, showcasing substantial innovation, especially in the modality alignment task. Here, we present the results, describe trends among well-performing approaches, and discuss challenges associated with running the competition.

