MapMyCells: High-performance mapping of unlabeled cell-by-gene data to reference brain taxonomies

Scott Daniel (Allen Institute), Changkyu Lee (Allen Institute), Tyler Mollenkopf (Allen Institute), M. H. Lee (Broad Institute), Joel Arbuckle (Allen Institute), Elysha Fiabane (Allen Institute), Mariano I. Gabitto (Allen Institute), Nelson Johansen (Allen Institute), Inkar Kapen (Allen Institute), Andrew W. Kraft (Broad Institute), Jane Lai (Allen Institute), Su Ying Li (Allen Institute), Ryan McGinty (Allen Institute), Jeremy A. Miller (Allen Institute), Skyler Welch-Moosman (Allen Institute), Sven Otto (Allen Institute), Lane Sawyer (Allen Institute), Noah Shepard (Allen Institute), Carol L. Thompson (Allen Institute), Andreas Tjärnberg (Allen Institute), Jack Waters (Allen Institute), Xingjian Zhen (Allen Institute), Evan Z. Macosko (Broad Institute), Ed S. Lein (Allen Institute), Lydia Ng (Allen Institute), Hongkui Zeng (Allen Institute), Shoaib Mufti (Allen Institute), Zizhen Yao (Allen Institute), Michael J. Hawrylycz (Allen Institute)
bioRxiv (Cold Spring Harbor Laboratory)
March 9, 2026
Cited by 3 | Open Access

Abstract

Single-cell mapping methods convert raw, heterogeneous single-cell datasets into interpretable and comparable representations of biological identity. As reference cell-type taxonomies mature, mapping new datasets to shared references has become a central strategy for enabling cross-study integration, reproducible annotation, and cumulative biological knowledge. Here we present MapMyCells, an open-source framework designed to align diverse single-cell omics datasets to hierarchical reference taxonomies with minimal preprocessing. MapMyCells provides out-of-the-box support for an expanding set of high-quality brain cell-type references generated by the Allen Institute for Brain Science, the BRAIN Initiative, and the Seattle Alzheimer’s Disease Brain Cell Atlas, including whole-brain mouse and human atlases, aging and Alzheimer’s disease cohorts, and a cross-species consensus taxonomy initially focused on the basal ganglia. MapMyCells enables efficient mapping of hundreds of thousands of cells on standard workstations without specialized hardware, providing a deterministic, scalable, and modality-agnostic approach that is robust across species and molecular assays. The framework produces interpretable confidence metrics and quantitative summaries of mapping performance, allowing users to evaluate assignment precision and accuracy. We demonstrate the mapping of unlabeled transcriptomic, epigenomic, and spatial datasets to reference taxonomies and describe a general workflow for preparing arbitrary hierarchical taxonomies for reference-based mapping. As the ecosystem of single-cell reference atlases expands, MapMyCells offers a practical and reproducible solution for community-scale cell-type annotation and cross-dataset integration, supporting the development of unified and extensible brain cell atlases.

