Transcriptional programs of neoantigen-specific TIL in anti-PD-1-treated lung cancers
Abstract: PD-1 blockade unleashes CD8 T cells [1], including those specific for mutation-associated neoantigens (MANA), but factors in the tumour microenvironment can inhibit these T cell responses. Single-cell transcriptomics have revealed global T cell dysfunction programs in tumour-infiltrating lymphocytes (TIL). However, the majority of TIL do not recognize tumour antigens [2], and little is known about transcriptional programs of MANA-specific TIL. Here, we identify MANA-specific T cell clones using the MANA functional expansion of specific T cells assay [3] in neoadjuvant anti-PD-1-treated non-small cell lung cancers (NSCLC). We use their T cell receptors as a ‘barcode’ to track and analyse their transcriptional programs in the tumour microenvironment using coupled single-cell RNA sequencing and T cell receptor sequencing. We find both MANA- and virus-specific clones in TIL, regardless of response, and MANA-, influenza- and Epstein–Barr virus-specific TIL each have unique transcriptional programs. Despite exposure to cognate antigen, MANA-specific TIL express an incompletely activated cytolytic program. MANA-specific CD8 T cells have hallmark transcriptional programs of tissue-resident memory (TRM) cells, but low levels of interleukin-7 receptor (IL-7R) and are functionally less responsive to interleukin-7 (IL-7) compared with influenza-specific TRM cells. Compared with those from responding tumours, MANA-specific clones from non-responding tumours express T cell receptors with markedly lower ligand-dependent signalling, are largely confined to HOBIT-high TRM subsets, and coordinately upregulate checkpoints, killer inhibitory receptors and inhibitors of T cell activation. These findings provide important insights for overcoming resistance to PD-1 blockade.
A systematic evaluation of single-cell RNA-sequencing imputation methods
Wenpin Hou, Zhicheng Ji, Hongkai Ji et al. | Genome Biology | 2020
BACKGROUND: The rapid development of single-cell RNA-sequencing (scRNA-seq) technologies has led to the emergence of many methods for removing systematic technical noises, including imputation methods, which aim to address the increased sparsity observed in single-cell data. Although many imputation methods have been developed, there is no consensus on how methods compare to each other. RESULTS: Here, we perform a systematic evaluation of 18 scRNA-seq imputation methods to assess their accuracy and usability. We benchmark these methods in terms of the similarity between imputed cell profiles and bulk samples and whether these methods recover relevant biological signals or introduce spurious noise in downstream differential expression, unsupervised clustering, and pseudotemporal trajectory analyses, as well as their computational run time, memory usage, and scalability. Methods are evaluated using data from both cell lines and tissues and from both plate- and droplet-based single-cell platforms. CONCLUSIONS: We found that the majority of scRNA-seq imputation methods outperformed no imputation in recovering gene expression observed in bulk RNA-seq. However, the majority of the methods did not improve performance in downstream analyses compared to no imputation, in particular for clustering and trajectory analysis, and thus should be used with caution. In addition, we found substantial variability in the performance of the methods within each evaluation aspect. Overall, MAGIC, kNN-smoothing, and SAVER were found to outperform the other methods most consistently.
Assessing GPT-4 for cell type annotation in single-cell RNA-seq analysis
Wenpin Hou, Zhicheng Ji | Nature Methods | 2024
Here we demonstrate that the large language model GPT-4 can accurately annotate cell types using marker gene information in single-cell RNA sequencing analysis. When evaluated across hundreds of tissue and cell types, GPT-4 generates cell type annotations exhibiting strong concordance with manual annotations. This capability can considerably reduce the effort and expertise required for cell type annotation. Additionally, we have developed an R software package GPTCelltype for GPT-4's automated cell type annotation.
Association of Cord Plasma Biomarkers of In Utero Acetaminophen Exposure With Risk of Attention-Deficit/Hyperactivity Disorder and Autism Spectrum Disorder in Childhood
Importance: Prior studies have raised concern about maternal acetaminophen use during pregnancy and increased risk of attention-deficit/hyperactivity disorder (ADHD) and autism spectrum disorder (ASD) in their children; however, most studies have relied on maternal self-report. Objective: To examine the prospective associations between cord plasma acetaminophen metabolites and physician-diagnosed ADHD, ASD, both ADHD and ASD, and developmental disabilities (DDs) in childhood. Design, Setting, and Participants: This prospective cohort study analyzed 996 mother-infant dyads, a subset of the Boston Birth Cohort, who were enrolled at birth and followed up prospectively at the Boston Medical Center from October 1, 1998, to June 30, 2018. Exposures: Three cord acetaminophen metabolites (unchanged acetaminophen, acetaminophen glucuronide, and 3-[N-acetyl-l-cystein-S-yl]-acetaminophen) were measured in archived cord plasma samples collected at birth. Main Outcomes and Measures: Physician-diagnosed ADHD, ASD, and other DDs as documented in the child's medical records. Results: Of 996 participants (mean [SD] age, 9.8 [3.9] years; 548 [55.0%] male), the final sample included 257 children (25.8%) with ADHD only, 66 (6.6%) with ASD only, 42 (4.2%) with both ADHD and ASD, 304 (30.5%) with other DDs, and 327 (32.8%) who were neurotypical. Unchanged acetaminophen levels were detectable in all cord plasma samples. Compared with being in the first tertile, being in the second and third tertiles of cord acetaminophen burden was associated with higher odds of ADHD diagnosis (odds ratio [OR] for second tertile, 2.26; 95% CI, 1.40-3.69; OR for third tertile, 2.86; 95% CI, 1.77-4.67) and ASD diagnosis (OR for second tertile, 2.14; 95% CI, 0.93-5.13; OR for third tertile, 3.62; 95% CI, 1.62-8.60).
Sensitivity analyses and subgroup analyses found consistent associations between acetaminophen burden and ADHD and acetaminophen burden and ASD across strata of potential confounders, including maternal indication, substance use, preterm birth, and child age and sex, for which point estimates for the ORs vary from 2.3 to 3.5 for ADHD and 1.6 to 4.1 for ASD. Conclusions and Relevance: Cord biomarkers of fetal exposure to acetaminophen were associated with significantly increased risk of childhood ADHD and ASD in a dose-response fashion. Our findings support previous studies regarding the association between prenatal and perinatal acetaminophen exposure and childhood neurodevelopmental risk and warrant additional investigations.
A statistical framework for differential pseudotime analysis with multiple single-cell RNA-seq samples
Wenpin Hou, Zhicheng Ji, Zeyu Chen et al. | Nature Communications | 2023
Pseudotime analysis with single-cell RNA-sequencing (scRNA-seq) data has been widely used to study dynamic gene regulatory programs along continuous biological processes. While many methods have been developed to infer the pseudotemporal trajectories of cells within a biological sample, it remains a challenge to compare pseudotemporal patterns with multiple samples (or replicates) across different experimental conditions. Here, we introduce Lamian, a comprehensive and statistically-rigorous computational framework for differential multi-sample pseudotime analysis. Lamian can be used to identify changes in a biological process associated with sample covariates, such as different biological conditions while adjusting for batch effects, and to detect changes in gene expression, cell density, and topology of a pseudotemporal trajectory. Unlike existing methods that ignore sample variability, Lamian draws statistical inference after accounting for cross-sample variability and hence substantially reduces sample-specific false discoveries that are not generalizable to new samples. Using both real scRNA-seq and simulation data, including an analysis of differential immune response programs between COVID-19 patients with different disease severity levels, we demonstrate the advantages of Lamian in decoding cellular gene expression programs in continuous biological processes.