Integration and querying of multimodal single-cell data with PoE-VAE

Anastasia Litinetskaya (Helmholtz Zentrum München), Maiia Shulman, Fabiola Curion, Artur Szałata, Alireza Omidi, Mohammad Lotfollahi (Helmholtz Zentrum München), Fabian J. Theis (Helmholtz Zentrum München)
bioRxiv (Cold Spring Harbor Laboratory)
March 17, 2022
Cited by 64 | Open Access

Abstract

Constructing joint representations from multimodal single-cell datasets is crucial for understanding cellular heterogeneity and function. Traditional methods, such as factor analysis and kNN-based approaches, scale poorly to large datasets and to more than a few modalities. In this work, we present a product-of-experts (PoE) VAE-based model that offers a flexible, scalable solution for integrating multimodal data and allows both unimodal and multimodal queries to be mapped onto a reference atlas. We evaluate how different strategies for combining modalities in the VAE framework affect query-to-reference mapping across diverse datasets, including CITE-seq and spatial metabolomics. Our benchmarks assess batch effect correction, biological signal preservation, and imputation of missing modalities. We showcase our approach in a mosaic setting, integrating CITE-seq and multiome data to accurately map unimodal and multimodal queries into the joint latent space. We extend this to spatial data by integrating gene expression and metabolomics from paired Visium and MALDI-MSI slides, achieving a high correlation between measured metabolite abundances and those predicted from spatial gene expression. Our results demonstrate that this VAE-based framework is scalable, robust, and readily applicable across multiple modalities, providing a powerful tool for data imputation, querying, and biological discovery.
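
To make the product-of-experts idea concrete, the following is a minimal sketch of how per-modality Gaussian posteriors could be fused into a single joint posterior while skipping modalities that were not measured for a cell. The abstract does not give the exact formulation, so this assumes the common PoE rule for diagonal Gaussian experts (precision-weighted averaging) with an optional standard normal prior expert. The function name `product_of_experts`, the `mask` argument, and the array layout are hypothetical and chosen only for illustration.

```python
import numpy as np


def product_of_experts(mus, logvars, mask, include_prior=True):
    """Fuse per-modality diagonal Gaussians into one joint Gaussian.

    Assumed (hypothetical) layout:
      mus, logvars: shape (n_modalities, n_cells, n_latent)
      mask:         shape (n_modalities, n_cells), True where the modality
                    was measured for that cell
    Returns the joint mean and joint variance, each (n_cells, n_latent).
    """
    mus = np.asarray(mus, dtype=float)
    logvars = np.asarray(logvars, dtype=float)
    # Add a trailing axis so the mask broadcasts over latent dimensions.
    mask = np.asarray(mask, dtype=bool)[..., None]

    # Precision = 1 / variance; a missing expert contributes zero precision.
    precisions = np.exp(-logvars) * mask
    weighted_mus = mus * precisions

    # Optional N(0, I) prior expert: precision 1, mean 0. With the prior
    # disabled, a cell with no measured modality would divide by zero.
    prior_precision = 1.0 if include_prior else 0.0
    total_precision = precisions.sum(axis=0) + prior_precision

    joint_var = 1.0 / total_precision
    joint_mu = weighted_mus.sum(axis=0) * joint_var
    return joint_mu, joint_var


# Two modalities, two cells, one latent dimension.
# Cell 0 has both modalities; cell 1 is missing the second one.
mus = [[[1.0], [2.0]], [[3.0], [0.0]]]
logvars = np.zeros((2, 2, 1))  # unit variance for every expert
mask = [[True, True], [True, False]]
joint_mu, joint_var = product_of_experts(mus, logvars, mask)
```

Under these assumptions, a unimodal query goes through the same code path as a multimodal one: the unmeasured modality is masked out, and the joint posterior falls back to the remaining experts and the prior.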

