HGC: fast hierarchical clustering for large-scale single-cell data
Abstract
SUMMARY: Clustering is a key step in revealing heterogeneity in single-cell data. Most existing single-cell clustering methods output a fixed number of clusters and provide no hierarchical information. Classical hierarchical clustering (HC) provides dendrograms of cells, but cannot scale to large datasets because of its high computational complexity. We present HGC, a fast Hierarchical Graph-based Clustering tool that addresses both problems by combining the advantages of graph-based clustering and HC. HGC constructs the hierarchical tree on the shared nearest-neighbor graph of cells with linear time complexity. Experiments showed that HGC enables multiresolution exploration of the biological hierarchy underlying the data, achieves state-of-the-art accuracy on benchmark data and scales to large datasets.

AVAILABILITY AND IMPLEMENTATION: The R package of HGC is available at https://bioconductor.org/packages/HGC/.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
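
To make the shared nearest-neighbor (SNN) graph mentioned in the summary concrete, here is a minimal Python sketch on toy data. It is an illustration under stated assumptions, not HGC's implementation: the function names `knn_sets` and `snn_graph` are hypothetical, the edge weight is assumed to be the Jaccard overlap of two cells' k-nearest-neighbor sets (one common SNN definition), and the neighbor search is brute force, so it does not reflect the scalability reported for HGC. The tree-building step is not shown.

```python
from math import dist

def knn_sets(points, k):
    """For each point, return the index set of its k nearest neighbors (self included)."""
    neighbors = []
    for p in points:
        # Sort all indices by distance to p; ties are broken by index for determinism.
        order = sorted(range(len(points)), key=lambda j: (dist(p, points[j]), j))
        neighbors.append(set(order[:k]))
    return neighbors

def snn_graph(points, k):
    """Build a weighted SNN graph as a dict {(i, j): weight} with i < j.

    Assumption: weight = Jaccard overlap of the two kNN sets; pairs that
    share no neighbors get no edge.
    """
    nbrs = knn_sets(points, k)
    edges = {}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            shared = len(nbrs[i] & nbrs[j])
            if shared:
                edges[(i, j)] = shared / len(nbrs[i] | nbrs[j])
    return edges

# Toy data: two well-separated groups of three "cells" in 2-D.
cells = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
graph = snn_graph(cells, k=3)
```

On this toy input, edges appear only within each group of three cells, which is the structure a graph-based hierarchical method would then merge bottom-up into a dendrogram.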