Kyle J. Travaglini

A molecular cell atlas of the human lung from single-cell RNA sequencing

Kyle J. Travaglini, Ahmad N. Nabhan, Lolita Penland et al.|Nature|2020

Cited by 1.8k|Open Access

An integrated cell atlas of the lung in health and disease

Lisa Sikkema, Ciro Ramírez-Suástegui, Daniel Strobl et al.|Nature Medicine|2023

Cited by 739|Open Access

Abstract: Single-cell technologies have transformed our understanding of human tissues. Yet, studies typically capture only a limited number of donors and disagree on cell type definitions. Integrating many single-cell datasets can address these limitations of individual studies and capture the variability present in the population. Here we present the integrated Human Lung Cell Atlas (HLCA), combining 49 datasets of the human respiratory system into a single atlas spanning over 2.4 million cells from 486 individuals. The HLCA presents a consensus cell type re-annotation with matching marker genes, including annotations of rare and previously undescribed cell types. Leveraging the number and diversity of individuals in the HLCA, we identify gene modules that are associated with demographic covariates such as age, sex and body mass index, as well as gene modules changing expression along the proximal-to-distal axis of the bronchial tree. Mapping new data to the HLCA enables rapid data annotation and interpretation. Using the HLCA as a reference for the study of disease, we identify shared cell states across multiple lung diseases, including SPP1+ profibrotic monocyte-derived macrophages in COVID-19, pulmonary fibrosis and lung carcinoma. Overall, the HLCA serves as an example for the development and use of large-scale, cross-dataset organ atlases within the Human Cell Atlas.

Capillary cell-type specialization in the alveolus

Astrid Gillich, Fan Zhang, Colleen G. Farmer et al.|Nature|2020

Cited by 471|Open Access

Index switching causes “spreading-of-signal” among multiplexed samples in Illumina HiSeq 4000 DNA sequencing

Rahul Sinha, Geoff Stanley, Gunsagar S. Gulati et al.|bioRxiv (Cold Spring Harbor Laboratory)|2017

Cited by 240|Open Access

Abstract: Illumina-based next generation sequencing (NGS) has accelerated biomedical discovery through its ability to generate thousands of gigabases of sequencing output per run at a fraction of the time and cost of conventional technologies. The process typically involves four basic steps: library preparation, cluster generation, sequencing, and data analysis. In 2015, a new chemistry of cluster generation was introduced in the newer Illumina machines (HiSeq 3000/4000/X Ten) called exclusion amplification (ExAmp), which was a fundamental shift from the earlier method of random cluster generation by bridge amplification on a non-patterned flow cell. The ExAmp chemistry, in conjunction with patterned flow cells containing nanowells at fixed locations, increases cluster density on the flow cell, thereby reducing the cost per run. It also increases sequence read quality, especially for longer read lengths (up to 150 base pairs). This advance has been widely adopted for genome sequencing because greater sequencing depth can be achieved for lower cost without compromising the quality of longer reads. We show that this promising chemistry is problematic, however, when multiplexing samples. We discovered that up to 5-10% of sequencing reads (or signals) are incorrectly assigned from a given sample to other samples in a multiplexed pool. We provide evidence that this “spreading-of-signals” arises from low levels of free index primers present in the pool. These index primers can prime pooled library fragments at random via complementary 3’ ends, and get extended by DNA polymerase, creating a new library molecule with a new index before binding to the patterned flow cell to generate a cluster for sequencing. This causes the resulting read from that cluster to be assigned to a different sample, causing the spread of signals within multiplexed samples.
We show that low levels of free index primers persist after the most common library purification procedure recommended by Illumina, and that the amount of signal spreading among samples is proportional to the level of free index primer present in the library pool. This artifact causes homogenization and misclassification of cells in single cell RNA-seq experiments. Therefore, all data generated in this way must now be carefully re-examined to ensure that “spreading-of-signals” has not compromised data analysis and conclusions. Re-sequencing samples using an older technology that uses conventional bridge amplification for cluster generation, or improved library cleanup strategies to remove free index primers, can minimize or eliminate this signal spreading artifact.

Integrated multimodal cell atlas of Alzheimer’s disease

Mariano I. Gabitto, Kyle J. Travaglini, Victoria M. Rachleff et al.|Nature Neuroscience|2024

Cited by 240|Open Access

Abstract: Alzheimer’s disease (AD) is the leading cause of dementia in older adults. Although AD progression is characterized by stereotyped accumulation of proteinopathies, the affected cellular populations remain understudied. Here we use multiomics, spatial genomics and reference atlases from the BRAIN Initiative to study middle temporal gyrus cell types in 84 donors with varying AD pathologies. This cohort includes 33 male donors and 51 female donors, with an average age at time of death of 88 years. We used quantitative neuropathology to place donors along a disease pseudoprogression score. Pseudoprogression analysis revealed two disease phases: an early phase with a slow increase in pathology, presence of inflammatory microglia, reactive astrocytes, loss of somatostatin+ inhibitory neurons, and a remyelination response by oligodendrocyte precursor cells; and a later phase with exponential increase in pathology, loss of excitatory neurons and Pvalb+ and Vip+ inhibitory neuron subtypes. These findings were replicated in other major AD studies.

Top publications by citations