Integrated analysis of multimodal single-cell data

Yuhan Hao (New York Genome Center), Stephanie Hao (New York Genome Center), Erica Andersen‐Nissen (Cape Town HVTN Immunology Laboratory / Hutchinson Centre Research Institute of South Africa), William M. Mauck (New York University), Shiwei Zheng (New York Genome Center), Andrew Butler (New York Genome Center), Madeline J. Lee (Stanford University), Aaron J. Wilk (Stanford University), Charlotte A. Darby (New York University), Michael Zager (Fred Hutch Cancer Center), Paul Hoffman (New York University), Marlon Stoeckius (New York Genome Center), Efthymia Papalexi (New York Genome Center), Eleni P. Mimitou (New York Genome Center), Jaison Jain (New York University), Avi Srivastava (New York University), Tim Stuart (New York University), Lamar M. Fleming (Fred Hutch Cancer Center), Bertrand Z. Yeung (BioLegend (United States)), Angela J. Rogers (Stanford University), M. Juliana McElrath (Fred Hutch Cancer Center), Catherine A. Blish (Chan Zuckerberg Initiative (United States)), Raphaël Gottardo (Fred Hutch Cancer Center), Peter Smibert (New York Genome Center), Rahul Satija (New York Genome Center)

Abstract

The simultaneous measurement of multiple modalities represents an exciting frontier for single-cell genomics and necessitates computational methods that can define cellular states based on multimodal data. Here, we introduce "weighted-nearest neighbor" analysis, an unsupervised framework to learn the relative utility of each data type in each cell, enabling an integrative analysis of multiple modalities. We apply our procedure to a CITE-seq dataset of 211,000 human peripheral blood mononuclear cells (PBMCs) with panels extending to 228 antibodies to construct a multimodal reference atlas of the circulating immune system. Multimodal analysis substantially improves our ability to resolve cell states, allowing us to identify and validate previously unreported lymphoid subpopulations. Moreover, we demonstrate how to leverage this reference to rapidly map new datasets and to interpret immune responses to vaccination and coronavirus disease 2019 (COVID-19). Our approach represents a broadly applicable strategy to analyze single-cell multimodal datasets and to look beyond the transcriptome toward a unified and multimodal definition of cellular identity.
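To make the idea of learning "the relative utility of each data type in each cell" more concrete, here is a minimal Python sketch. It is a simplified illustration and not the procedure used in the paper: it assumes two modalities given as plain lists of feature vectors, scores each modality by how much better a cell's profile is predicted from neighbors found in that same modality than from neighbors found in the other modality, and turns the two scores into per-cell weights with a softmax. All function names, the toy data, and the parameters `k` and `eps` are hypothetical.

```python
import math

def dist(u, v):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def knn(data, i, k):
    """Indices of the k cells closest to cell i (excluding i itself)."""
    others = [j for j in range(len(data)) if j != i]
    return sorted(others, key=lambda j: dist(data[i], data[j]))[:k]

def mean_profile(data, idx):
    """Average feature vector over the cells listed in idx."""
    return [sum(data[j][d] for j in idx) / len(idx) for d in range(len(data[0]))]

def modality_weights(mod_a, mod_b, k=2, eps=1e-6):
    """Return one (w_a, w_b) pair per cell; each pair sums to 1."""
    weights = []
    for i in range(len(mod_a)):
        nn_a, nn_b = knn(mod_a, i, k), knn(mod_b, i, k)
        # Error when predicting each profile from same-modality neighbors...
        within_a = dist(mod_a[i], mean_profile(mod_a, nn_a))
        within_b = dist(mod_b[i], mean_profile(mod_b, nn_b))
        # ...and from neighbors that were chosen in the other modality.
        cross_a = dist(mod_a[i], mean_profile(mod_a, nn_b))
        cross_b = dist(mod_b[i], mean_profile(mod_b, nn_a))
        # A modality scores high when its own neighbors predict it much
        # better than the other modality's neighbors do (assumed heuristic).
        score_a = cross_a / (within_a + eps)
        score_b = cross_b / (within_b + eps)
        # Numerically stable two-way softmax over the scores.
        m = max(score_a, score_b)
        e_a, e_b = math.exp(score_a - m), math.exp(score_b - m)
        weights.append((e_a / (e_a + e_b), e_b / (e_a + e_b)))
    return weights

def weighted_similarity(mod_a, mod_b, weights, i, j):
    """Similarity of cell j to cell i, mixing modalities with cell i's weights."""
    w_a, w_b = weights[i]
    return (w_a * math.exp(-dist(mod_a[i], mod_a[j]))
            + w_b * math.exp(-dist(mod_b[i], mod_b[j])))

# Toy data (hypothetical): modality A separates cells 0-2 from cells 3-5,
# while modality B interleaves the two groups and carries no group signal.
toy_a = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]
toy_b = [[0.0], [1.0], [2.0], [0.1], [1.1], [2.1]]
weights = modality_weights(toy_a, toy_b)
```

On this toy input, modality A receives most of the weight in every cell, so the weighted similarity between cells 0 and 1 (same group) exceeds that between cells 0 and 3 (different groups), even though cells 0 and 3 sit close together in modality B. Real data would need normalization, dimensionality reduction, and a far more careful choice of neighbors and affinities than this sketch uses.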

