HGC: fast hierarchical clustering for large-scale single-cell data

Ziheng Zou (Tsinghua University), Kui Hua (Tsinghua University), Xuegong Zhang (Tsinghua University)
Bioinformatics
June 4, 2021
Cited by 27

Abstract

SUMMARY: Clustering is a key step in revealing heterogeneity in single-cell data. Most existing single-cell clustering methods output a fixed number of clusters and provide no hierarchical information. Classical hierarchical clustering (HC) provides dendrograms of cells, but cannot scale to large datasets because of its high computational complexity. We present HGC, a fast Hierarchical Graph-based Clustering tool that addresses both problems by combining the advantages of graph-based clustering and HC. On the shared nearest-neighbor (SNN) graph of cells, HGC constructs the hierarchical tree with linear time complexity. Experiments showed that HGC enables multiresolution exploration of the biological hierarchy underlying the data, achieves state-of-the-art accuracy on benchmark data and scales to large datasets.

AVAILABILITY AND IMPLEMENTATION: The R package of HGC is available at https://bioconductor.org/packages/HGC/.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
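The two-stage idea in the summary (build a shared nearest-neighbor graph of cells, then merge nodes into a tree) can be illustrated with a small Python sketch. It is not the HGC implementation and does not use the HGC R package: the function names `knn_sets`, `snn_graph` and `greedy_merge_tree` are hypothetical, the Jaccard edge weight is one common SNN definition assumed here, and the naive average-weight merge rule stands in for whatever linkage HGC actually uses. This toy version is also far from linear time; it only shows the shape of the computation.

```python
from itertools import combinations
from math import dist


def knn_sets(points, k):
    """Brute-force k nearest neighbors of every point, as sets of indices."""
    neighbors = []
    for p in points:
        order = sorted(range(len(points)), key=lambda j: dist(p, points[j]))
        # With distinct points, the point itself sorts first (distance 0),
        # so each set holds the point plus its k nearest neighbors.
        neighbors.append(set(order[: k + 1]))
    return neighbors


def snn_graph(points, k):
    """SNN edges weighted by Jaccard overlap of k-NN sets (assumed definition)."""
    nbrs = knn_sets(points, k)
    edges = {}
    for i, j in combinations(range(len(points)), 2):
        shared = len(nbrs[i] & nbrs[j])
        if shared:  # pairs with no shared neighbors get no edge
            edges[(i, j)] = shared / len(nbrs[i] | nbrs[j])
    return edges


def greedy_merge_tree(n, edges):
    """Naive agglomerative merging on the graph.

    Repeatedly joins the two clusters with the highest average
    inter-cluster edge weight and records each merge as a pair of
    sorted member lists. Illustrative only; not HGC's linkage.
    """
    clusters = {i: [i] for i in range(n)}
    merges = []
    next_id = n
    while len(clusters) > 1:
        best, best_pair = -1.0, None
        for a, b in combinations(sorted(clusters), 2):
            total = sum(
                edges.get((min(u, v), max(u, v)), 0.0)
                for u in clusters[a]
                for v in clusters[b]
            )
            score = total / (len(clusters[a]) * len(clusters[b]))
            if score > best:
                best, best_pair = score, (a, b)
        a, b = best_pair
        members_a, members_b = clusters.pop(a), clusters.pop(b)
        merges.append((sorted(members_a), sorted(members_b)))
        clusters[next_id] = members_a + members_b
        next_id += 1
    return merges


if __name__ == "__main__":
    # Six toy "cells" in two well-separated groups of three.
    cells = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    graph = snn_graph(cells, k=2)
    for left, right in greedy_merge_tree(len(cells), graph):
        print(left, "+", right)
```

Reading the merge list from the end toward the start corresponds to cutting the tree at progressively finer resolutions, which is the kind of multiresolution exploration the summary refers to. On the toy data the two groups are joined only in the final merge.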

