
Yun Yan

The University of Texas MD Anderson Cancer Center

ORCID: 0000-0002-3701-9608

Publishes on cancer genomics and diagnostics, single-cell and spatial transcriptomics, and epigenetics and DNA methylation. 35 papers and 2.4k citations.



Top publications by citations

K-nearest neighbor smoothing for high-throughput single-cell RNA-Seq data
Florian Wagner, Yun Yan, Itai Yanai | bioRxiv (Cold Spring Harbor Laboratory) | 2017
Cited by 159 | Open Access

High-throughput single-cell RNA-Seq (scRNA-Seq) is a powerful approach for studying heterogeneous tissues and dynamic cellular processes. However, compared to bulk RNA-Seq, single-cell expression profiles are extremely noisy, as they only capture a fraction of the transcripts present in the cell. Here, we propose the k-nearest neighbor smoothing (kNN-smoothing) algorithm, designed to reduce noise by aggregating information from similar cells (neighbors) in a computationally efficient and statistically tractable manner. The algorithm is based on the observation that across protocols, the technical noise exhibited by UMI-filtered scRNA-Seq data closely follows Poisson statistics. Smoothing is performed by first identifying the nearest neighbors of each cell in a step-wise fashion, based on partially smoothed and variance-stabilized expression profiles, and then aggregating their transcript counts. We show that kNN-smoothing greatly improves the detection of clusters of cells and co-expressed genes, and clearly outperforms other smoothing methods on simulated data. To accurately perform smoothing for datasets containing highly similar cell populations, we propose the kNN-smoothing 2 algorithm, in which neighbors are determined after projecting the partially smoothed data onto the first few principal components. We show that unlike its predecessor, kNN-smoothing 2 can accurately distinguish between cells from different T cell subsets, and enables their identification in peripheral blood using unsupervised methods. Our work facilitates the analysis of scRNA-Seq data across a broad range of applications, including the identification of cell populations in heterogeneous tissues and the characterization of dynamic processes such as cellular differentiation. Reference implementations of our algorithms can be found at https://github.com/yanailab/knn-smoothing.
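The step-wise procedure described in this abstract can be illustrated with a minimal sketch. This is not the reference implementation from the repository above. The function names (`freeman_tukey`, `knn_smooth_sketch`), the doubling schedule for the neighbor count, the median normalization, and the choice of the Freeman-Tukey transform for variance stabilization are assumptions made for illustration. The sketch also assumes a genes-by-cells matrix of raw UMI counts in which every cell has at least one count.

```python
import numpy as np


def freeman_tukey(x):
    """Variance-stabilizing transform for Poisson-like counts (assumed choice)."""
    return np.sqrt(x) + np.sqrt(x + 1)


def knn_smooth_sketch(counts, k):
    """Aggregate each cell's raw counts with those of its k nearest neighbors.

    counts: array of shape (genes, cells) holding raw UMI counts.
    k: number of neighbors to aggregate with each cell (0 <= k < cells).
    Neighbors are found step-wise on partially smoothed, normalized,
    variance-stabilized profiles; the raw counts are what gets summed.
    """
    counts = np.asarray(counts, dtype=float)
    n_cells = counts.shape[1]
    if not 0 <= k < n_cells:
        raise ValueError("k must satisfy 0 <= k < number of cells")

    smoothed = counts.copy()
    k_step = 0
    while k_step < k:
        # Grow the neighborhood gradually: 1, 3, 7, ... capped at k.
        k_step = min(2 * k_step + 1, k)

        # Scale every cell to the median total count, then stabilize variance.
        totals = smoothed.sum(axis=0)
        normalized = smoothed * (np.median(totals) / totals)
        y = freeman_tukey(normalized)

        # Squared Euclidean distance between all pairs of cells.
        sq = (y ** 2).sum(axis=0)
        d2 = sq[:, None] + sq[None, :] - 2.0 * (y.T @ y)
        # Force each cell to rank first in its own neighbor list.
        np.fill_diagonal(d2, -np.inf)

        # Indices of each cell plus its k_step nearest neighbors.
        nearest = np.argsort(d2, axis=1, kind="stable")[:, : k_step + 1]

        # Sum the RAW counts over each neighborhood.
        smoothed = counts[:, nearest].sum(axis=2)
    return smoothed
```

With `k = 0` the input is returned unchanged. Because raw counts are summed rather than averaged, each smoothed profile remains a sum of count vectors, which is in keeping with the Poisson noise model the abstract refers to.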

Differential chromatin binding of the lung lineage transcription factor NKX2-1 resolves opposing murine alveolar cell fates in vivo
Danielle R. Little, Anne M. Lynch, Yun Yan et al. | Nature Communications | 2021
Cited by 115 | Open Access

Differential transcription of identical DNA sequences leads to distinct tissue lineages and then multiple cell types within a lineage, an epigenetic process central to progenitor and stem cell biology. The associated genome-wide changes, especially in native tissues, remain insufficiently understood, and are hereby addressed in the mouse lung, where the same lineage transcription factor NKX2-1 promotes the diametrically opposed alveolar type 1 (AT1) and AT2 cell fates. Here, we report that the cell-type-specific function of NKX2-1 is attributed to its differential chromatin binding that is acquired or retained during development in coordination with partner transcriptional factors. Loss of YAP/TAZ redirects NKX2-1 from its AT1-specific to AT2-specific binding sites, leading to transcriptionally exaggerated AT2 cells when deleted in progenitors or AT1-to-AT2 conversion when deleted after fate commitment. Nkx2-1 mutant AT1 and AT2 cells gain distinct chromatin accessible sites, including those specific to the opposite fate while adopting a gastrointestinal fate, suggesting an epigenetic plasticity unexpected from transcriptional changes. Our genomic analysis of single or purified cells, coupled with precision genetics, provides an epigenetic basis for alveolar cell fate and potential, and introduces an experimental benchmark for deciphering the in vivo function of lineage transcription factors.