Integrated analysis of multimodal single-cell data

Yuhan Hao (New York Genome Center), Stephanie Hao (New York Genome Center), Erica Andersen‐Nissen (Fred Hutch Cancer Center), William M. Mauck (New York University), Shiwei Zheng (New York Genome Center), Andrew Butler (New York Genome Center), Maddie J. Lee (Stanford University), Aaron J. Wilk (Stanford University), Charlotte A. Darby (New York University), Michael Zagar (Fred Hutch Cancer Center), Paul Hoffman (New York University), Marlon Stoeckius (New York Genome Center), Efthymia Papalexi (New York Genome Center), Eleni P. Mimitou (New York Genome Center), Jaison Jain (New York University), Avi Srivastava (New York University), Tim Stuart (New York University), Lamar Ballweber-Fleming (Fred Hutch Cancer Center), Bertrand Z. Yeung (BioLegend (United States)), Angela J. Rogers (Stanford University), M. Juliana McElrath (Fred Hutch Cancer Center), Catherine A. Blish (Chan Zuckerberg Initiative (United States)), Raphaël Gottardo (Fred Hutch Cancer Center), Peter Smibert (New York Genome Center), Rahul Satija (New York Genome Center)
bioRxiv (Cold Spring Harbor Laboratory)
October 12, 2020
Cited by 484
Open Access

Abstract

The simultaneous measurement of multiple modalities, known as multimodal analysis, represents an exciting frontier for single-cell genomics and necessitates new computational methods that can define cellular states based on multiple data types. Here, we introduce ‘weighted-nearest neighbor’ analysis, an unsupervised framework to learn the relative utility of each data type in each cell, enabling an integrative analysis of multiple modalities. We apply our procedure to a CITE-seq dataset of hundreds of thousands of human white blood cells alongside a panel of 228 antibodies to construct a multimodal reference atlas of the circulating immune system. We demonstrate that integrative analysis substantially improves our ability to resolve cell states and validate the presence of previously unreported lymphoid subpopulations. Moreover, we demonstrate how to leverage this reference to rapidly map new datasets, and to interpret immune responses to vaccination and COVID-19. Our approach represents a broadly applicable strategy to analyze single-cell multimodal datasets, including paired measurements of RNA and chromatin state, and to look beyond the transcriptome towards a unified and multimodal definition of cellular identity.

Availability

Installation instructions, documentation, tutorials, and CITE-seq datasets are available at http://www.satijalab.org/seurat

