A systematic benchmark of Nanopore long read RNA sequencing for transcript level analysis in human cell lines

Ying Chen(Genome Institute of Singapore), N. Davidson(The University of Melbourne), Yuk Kei Wan(Genome Institute of Singapore), Harshil Patel(The Francis Crick Institute), Fei Yao(Genome Institute of Singapore), Hwee Meng Low(Genome Institute of Singapore), Christopher Hendra(National University of Singapore), Laura Watten(Genome Institute of Singapore), Andre Sim(Genome Institute of Singapore), Chelsea Sawyer(The Francis Crick Institute), Viktoriia Iakovleva(National University of Singapore), Puay Leng Lee(Genome Institute of Singapore), Lixia Xin(Genome Institute of Singapore), Hui En Vanessa Ng(National University of Singapore), Jia Min Loo(Genome Institute of Singapore), Xuewen Ong(Duke-NUS Medical School), Hui Qi Amanda Ng(National University of Singapore), Jiaxu Wang(Genome Institute of Singapore), Wei Qian Casslynn Koh(Genome Institute of Singapore), Suk Yeah Polly Poon(Genome Institute of Singapore), Dominik Stanojević(University of Zagreb), Hoang-Dai Tran(Genome Institute of Singapore), Kok Hao Edwin Lim(Genome Institute of Singapore), Shen Yon Toh(National Cancer Centre Singapore), Philip Ewels(Stockholm University), Huck‐Hui Ng(National University of Singapore), N. Gopalakrishna Iyer(Duke-NUS Medical School), Alexandre H. Thiéry(National University of Singapore), Wee Joo Chng(National University of Singapore), Leilei Chen(National University of Singapore), Ramanuj DasGupta(Genome Institute of Singapore), Mile Šikić(University of Zagreb), Yun-Shen Chan(Genome Institute of Singapore), Boon Ooi Patrick Tan(National University of Singapore), Yue Wan(Genome Institute of Singapore), Wai Leong Tam(National University of Singapore), Qiang Yu(Genome Institute of Singapore), Chiea Chuan Khor(Singapore Eye Research Institute), Torsten Wüstefeld(National University Cancer Institute, Singapore), Ploy N. Pratanwanich(Chulalongkorn University), Michael I. Love(University of North Carolina at Chapel Hill), W.S. Sho Goh(Shenzhen Bay Laboratory), Sarah Ng(Genome Institute of Singapore), Alicia Oshlack(The University of Melbourne), Jonathan Göke(National Cancer Centre Singapore)
bioRxiv (Cold Spring Harbor Laboratory)
April 22, 2021
Cited by 114
Open Access

Abstract

The human genome contains more than 200,000 gene isoforms. However, different isoforms can be highly similar, and with an average length of 1.5 kb they remain difficult to study with short read sequencing. To systematically evaluate the ability to study the transcriptome at the resolution of individual isoforms, we profiled 5 human cell lines with short read cDNA sequencing and with three Nanopore long read protocols: direct RNA, amplification-free direct cDNA, and PCR-cDNA sequencing. The long read protocols showed a high level of consistency, with amplification-free RNA and cDNA sequencing being most similar. While short and long reads generated comparable gene expression estimates, they differed substantially for individual isoforms. We find that increased read length improves read-to-transcript assignment, identifies interactions between alternative promoters and splicing, enables the discovery of novel transcripts from repetitive regions, facilitates the quantification of full-length fusion isoforms, and enables the simultaneous profiling of m6A RNA modifications when RNA is sequenced directly. Our study demonstrates the advantage of long read RNA sequencing and provides a comprehensive resource that will enable the development and benchmarking of computational methods for profiling complex transcriptional events at isoform-level resolution.

