Single duplex DNA sequencing with CODEC detects mutations with high sensitivity

Jin H. Bae (Broad Institute), Ruolin Liu (Broad Institute), Eugenia Roberts (Broad Institute), Erica Nguyen (Broad Institute), Shervin Tabrizi (Broad Institute), Justin Rhoades (Broad Institute), Timothy Blewett (Broad Institute), Kan Xiong (Broad Institute), Gregory Gydush (Broad Institute), Douglas Shea (Broad Institute), Zhenyi An (Broad Institute), Sahil Patel (Broad Institute), Ju Cheng (Broad Institute), Sainetra Sridhar (Broad Institute), Mei Hong Liu (Center for Human Genetics), Emilie Lassen, Anne‐Bine Skytte, Marta Grońska-Pęski (Center for Human Genetics), Jonathan E. Shoag (University Hospitals of Cleveland), Gilad D. Evrony (Center for Human Genetics), Heather A. Parsons (Dana-Farber Cancer Institute), Erica L. Mayer (Dana-Farber Cancer Institute), G. Mike Makrigiorgos (Dana-Farber Cancer Institute), Todd R. Golub (Broad Institute), Viktor A. Adalsteinsson (Broad Institute)
Nature Genetics
April 27, 2023
Cited by 81 | Open Access
Abstract

Detecting mutations from single DNA molecules is crucial in many fields but challenging. Next-generation sequencing (NGS) affords tremendous throughput but cannot directly sequence double-stranded DNA molecules (‘single duplexes’) to discern the true mutations on both strands. Here we present Concatenating Original Duplex for Error Correction (CODEC), which confers single duplex resolution to NGS. CODEC affords 1,000-fold higher accuracy than NGS, using up to 100-fold fewer reads than duplex sequencing. CODEC revealed mutation frequencies of 2.72 × 10⁻⁸ in sperm of a 39-year-old individual, and somatic mutations acquired with age in blood cells. CODEC detected genome-wide, clonal hematopoiesis mutations from single DNA molecules, single mutated duplexes from tumor genomes and liquid biopsies, microsatellite instability with 10-fold greater sensitivity and mutational signatures, and specific tumor mutations with up to 100-fold fewer reads. CODEC enables more precise genetic testing and reveals biologically significant mutations, which are commonly obscured by NGS errors.

