ESPRESSO: Robust discovery and quantification of transcript isoforms from error-prone long-read RNA-seq data
Yuan Gao, Feng Wang, Robert Wang et al.|Science Advances|2023
Long-read RNA sequencing (RNA-seq) holds great potential for characterizing transcriptome variation and full-length transcript isoforms, but the relatively high error rate of current long-read sequencing platforms poses a major challenge. We present ESPRESSO, a computational tool for robust discovery and quantification of transcript isoforms from error-prone long reads. ESPRESSO jointly considers alignments of all long reads aligned to a gene and uses error profiles of individual reads to improve the identification of splice junctions and the discovery of their corresponding transcript isoforms. On both a synthetic spike-in RNA sample and human RNA samples, ESPRESSO outperforms multiple contemporary tools in not only transcript isoform discovery but also transcript isoform quantification. In total, we generated and analyzed ~1.1 billion nanopore RNA-seq reads covering 30 human tissue samples and three human cell lines. ESPRESSO and its companion dataset provide a useful resource for studying the RNA repertoire of eukaryotic transcriptomes.
Long-read RNA sequencing: A transformative technology for exploring transcriptome complexity in human diseases
Long-read RNA sequencing (RNA-seq) is emerging as a powerful and versatile technology for studying human transcriptomes. By enabling the end-to-end sequencing of full-length transcripts, long-read RNA-seq opens up avenues for investigating various RNA species and features that cannot be reliably interrogated by standard short-read RNA-seq methods. In this review, we present an overview of long-read RNA-seq, delineating its strengths over short-read RNA-seq, as well as summarizing recent advances in experimental and computational approaches to boost the power of long-read-based transcriptomics. We describe a wide range of applications of long-read RNA-seq, and highlight its expanding role as a foundational technology for exploring transcriptome variations in human diseases.
TEQUILA-seq: a versatile and low-cost method for targeted long-read RNA sequencing
Feng Wang, Yang Xu, Robert Wang et al.|Nature Communications|2023
Long-read RNA sequencing (RNA-seq) is a powerful technology for transcriptome analysis, but the relatively low throughput of current long-read sequencing platforms limits transcript coverage. One strategy for overcoming this bottleneck is targeted long-read RNA-seq for preselected gene panels. We present TEQUILA-seq, a versatile, easy-to-implement, and low-cost method for targeted long-read RNA-seq utilizing isothermally linear-amplified capture probes. When performed on the Oxford Nanopore platform with multiple gene panels of varying sizes, TEQUILA-seq consistently and substantially enriches transcript coverage while preserving transcript quantification. We profile full-length transcript isoforms of 468 actionable cancer genes across 40 representative breast cancer cell lines. We identify transcript isoforms enriched in specific subtypes and discover novel transcript isoforms in extensively studied cancer genes such as TP53. Among cancer genes, tumor suppressor genes (TSGs) are significantly enriched for aberrant transcript isoforms targeted for degradation via mRNA nonsense-mediated decay, revealing a common RNA-associated mechanism for TSG inactivation. TEQUILA-seq reduces the per-reaction cost of targeted capture by 2-3 orders of magnitude compared with a standard commercial solution. TEQUILA-seq can be broadly used for targeted sequencing of full-length transcripts in diverse biomedical research settings.
Functional analysis of ESRP1/2 gene variants and CTNND1 isoforms in orofacial cleft pathogenesis
Orofacial cleft (OFC) is a common human congenital anomaly. The epithelial-specific RNA splicing regulators ESRP1 and ESRP2 regulate craniofacial morphogenesis, and their disruption results in OFC in zebrafish, mice, and humans. Using esrp1/2 mutant zebrafish and murine Py2T cell line models, we functionally tested the pathogenicity of human ESRP1/2 gene variants. We found that many variants predicted by in silico methods to be pathogenic were functionally benign. Esrp1 also regulates the alternative splicing of Ctnnd1, and these genes are co-expressed in the embryonic and oral epithelium. In fact, over-expression of ctnnd1 is sufficient to rescue morphogenesis of epithelial-derived structures in esrp1/2 zebrafish mutants. Additionally, we identified 13 CTNND1 variants from genome sequencing of OFC cohorts, confirming CTNND1 as a key gene in human OFC. This work highlights the importance of functional assessment of human gene variants and demonstrates the critical requirement of Esrp-Ctnnd1 acting in the embryonic epithelium to regulate palatogenesis.
Isoform characterization of m6A in single cells identifies its role in RNA surveillance
Zhijun Ren, Jialiang He, Xiang Huang et al.|Nature Communications|2025
The distribution of m6A across various RNA isoforms and its heterogeneity within single cells are still not well understood. Here, we develop m6A-isoSC-seq, which employs both Oxford Nanopore long-read and Illumina short-read sequencing on the same 10x Genomics single-cell cDNA library with APOBEC1-YTH induced C-to-U mutations near m6A sites. Through m6A-isoSC-seq on a pooled sample of three cell line origins, we unveil a profound degree of m6A heterogeneity at both the isoform and single-cell levels. Through comparisons across single cells, we identify widespread specific m6A methylation on certain RNA isoforms, usually misprocessed ones. Compared to the coding isoforms of the same genes, the expression of highly methylated misprocessed RNA isoforms is more sensitive to METTL3 depletion. These misprocessed RNAs tend to have excessive m6A sites in coding regions, which are targets of CDS-m6A decay (CMD). This study offers previously undocumented insights into the role of m6A in RNA surveillance.
The heterogeneity of isoform-level m6A RNA methylation in single cells is unclear. The authors characterize m6A at both the single-cell and isoform levels through ONT long-read sequencing on a single-cell cDNA library with APOBEC1-YTH induced C-to-U mutations. They identify a role for m6A in the surveillance of misprocessed RNAs through the CDS-m6A decay mechanism.