scvi-tools: a library for deep probabilistic analysis of single-cell omics data

Adam Gayoso (University of California, Berkeley), Romain Lopez (University of California, Berkeley), Galen Xing (Chan Zuckerberg Initiative (United States)), Pierre Boyeau (École Normale Supérieure Paris-Saclay), K. Wu (University of California, Berkeley), Michael Jayasuriya (University of California, Berkeley), Edouard Melhman (École Normale Supérieure Paris-Saclay), Maxime Langevin (École Polytechnique), Yining Liu (University of California, Berkeley), Jules Samaran (Université Paris Sciences et Lettres), Gabriel Misrachi (École Polytechnique), Achille Nazaret (École Polytechnique), Oscar Clivio (École Normale Supérieure Paris-Saclay), Chenling Xu (University of California, Berkeley), Tal Ashuach (University of California, Berkeley), Mohammad Lotfollahi (Helmholtz Zentrum München), Valentine Svensson (Q Therapeutics (United States)), Eduardo da Veiga Beltrame (California Institute of Technology), Carlos Talavera‐López (European Bioinformatics Institute), Lior Pachter (California Institute of Technology), Fabian J. Theis (Helmholtz Zentrum München), Aaron Streets (Chan Zuckerberg Initiative (United States)), Michael I. Jordan (University of California, Berkeley), Jeffrey Regier (University of Michigan), Nir Yosef (Ragon Institute of MGH, MIT and Harvard)
bioRxiv (Cold Spring Harbor Laboratory)
April 29, 2021
Cited by 72 | Open Access

Abstract

Probabilistic models have provided the underpinnings for state-of-the-art performance in many single-cell omics data analysis tasks, including dimensionality reduction, clustering, differential expression, annotation, removal of unwanted variation, and integration across modalities. Many of the models being deployed are amenable to scalable stochastic inference techniques, and accordingly they are able to process single-cell datasets of realistic and growing sizes. However, the community-wide adoption of probabilistic approaches is hindered by a fractured software ecosystem, resulting in an array of packages with distinct and often complex interfaces. To address this issue, we developed scvi-tools (https://scvi-tools.org), a Python package that implements a variety of leading probabilistic methods. These methods, which cover many fundamental analysis tasks, are accessible through a standardized, easy-to-use interface with direct links to Scanpy, Seurat, and Bioconductor workflows. By standardizing the implementations, we were able to develop and reuse novel functionalities across different models, such as support for complex study designs through nonlinear removal of unwanted variation due to multiple covariates, and reference-query integration via scArches. The extensible software building blocks that underlie scvi-tools also enable a developer environment in which new probabilistic models for single-cell omics can be efficiently developed, benchmarked, and deployed. We demonstrate this through a code-efficient reimplementation of Stereoscope for deconvolution of spatial transcriptomics profiles. By catering to both end-user and developer audiences, we expect scvi-tools to become an essential software dependency and to help formulate a community standard for probabilistic modeling of single-cell omics.

