FlowGrid enables fast clustering of very large single-cell RNA-seq data

Xiunan Fang (Chinese University of Hong Kong), Joshua W. K. Ho (Chinese University of Hong Kong)
Bioinformatics
July 18, 2021
Cited by 7

Abstract

MOTIVATION: Scalable clustering algorithms are needed to analyze millions of cells in single-cell RNA-seq (scRNA-seq) data.

RESULTS: Here, we present an open-source Python package called FlowGrid that integrates into the Scanpy workflow to perform clustering on very large scRNA-seq datasets. FlowGrid implements a fast density-based clustering algorithm originally designed for flow cytometry data analysis. We introduce a new automated parameter tuning procedure, and show that FlowGrid achieves clustering accuracy comparable to that of state-of-the-art clustering algorithms, but at a substantially reduced run time for very large scRNA-seq datasets. For example, FlowGrid can complete in about five minutes a clustering task on one million cells that would otherwise take about one hour.

AVAILABILITY AND IMPLEMENTATION: https://github.com/holab-hku/FlowGrid.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
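
To make the phrase "density-based clustering" concrete, here is a minimal sketch of a generic grid-based density clustering procedure. It is an illustration only, not FlowGrid's implementation or API: the function name `grid_density_cluster` and the parameters `n_bins` and `min_count` are hypothetical, and the rule used here (bins holding at least `min_count` points are dense, and adjacent dense bins are merged) is a simplification assumed for the example. The sketch also assumes low-dimensional input, such as a handful of components, because the number of neighbouring bins grows as 3^d.

```python
from collections import deque
from itertools import product

import numpy as np


def grid_density_cluster(points, n_bins=4, min_count=3):
    """Toy grid-based density clustering (hypothetical helper, not FlowGrid).

    Returns one integer label per point; -1 marks points in sparse bins.
    """
    points = np.asarray(points, dtype=float)

    # Rescale every dimension to [0, 1] so equal-width bins are comparable.
    lo = points.min(axis=0)
    span = points.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid division by zero for constant dimensions
    scaled = (points - lo) / span

    # Assign each point to a bin; clip so the maximum falls in the last bin.
    bins = np.minimum((scaled * n_bins).astype(int), n_bins - 1)

    # Count points per occupied bin.
    uniq, inverse, counts = np.unique(
        bins, axis=0, return_inverse=True, return_counts=True
    )
    inverse = inverse.reshape(-1)

    # Dense bins: bin coordinates -> row index in `uniq`.
    dense = {
        tuple(int(v) for v in b): i
        for i, (b, c) in enumerate(zip(uniq, counts))
        if c >= min_count
    }

    # All neighbouring offsets (every combination of -1/0/1 except all zeros).
    offsets = [o for o in product((-1, 0, 1), repeat=points.shape[1]) if any(o)]

    # Breadth-first search merges adjacent dense bins into one cluster.
    bin_label = np.full(len(uniq), -1)
    next_label = 0
    for start, idx in dense.items():
        if bin_label[idx] != -1:
            continue
        bin_label[idx] = next_label
        queue = deque([start])
        while queue:
            cur = queue.popleft()
            for off in offsets:
                nb = tuple(c + o for c, o in zip(cur, off))
                j = dense.get(nb)
                if j is not None and bin_label[j] == -1:
                    bin_label[j] = next_label
                    queue.append(nb)
        next_label += 1

    # Each point inherits the label of its bin.
    return bin_label[inverse]


# Two tight groups and one isolated point (made-up illustrative data).
demo_points = [
    [0.0, 0.0], [0.1, 0.1], [0.05, 0.02], [0.02, 0.08],
    [10.0, 10.0], [9.9, 9.8], [9.95, 9.9], [9.8, 9.95],
    [5.0, 5.0],
]
labels = grid_density_cluster(demo_points, n_bins=4, min_count=3)
print(labels.tolist())  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The appeal of this family of methods for very large datasets is that the expensive step operates on occupied bins rather than on individual cells, so the cost of the neighbour search depends on the number of bins and not directly on the number of points.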

