Human-specific tandem repeat expansion and differential gene expression during primate evolution

Arvis Sulovari (University of Washington), Ruiyang Li (University of Washington), Peter A. Audano (University of Washington), David Porubský (University of Washington), Mitchell R. Vollger (University of Washington), Glennis A. Logsdon (University of Washington), Wesley C. Warren (University of Missouri), Alex A. Pollen (University of California, San Francisco), Mark Chaisson (University of Southern California), Evan E. Eichler (Howard Hughes Medical Institute), Ashley D. Sanders, Xuefang Zhao, Ankit Malhotra, Tobias Rausch, Eugene J. Gardner, Oscar L. Rodriguez, Li Guo, Ryan L. Collins, Xian Fan, Jia Wen, Robert E. Handsaker, Susan Fairley, Zev Kronenberg, Xiangmeng Kong, Fereydoun Hormozdiari, Dillon Lee, Aaron M. Wenger, Alex Hastie, Danny Antaki, Thomas Anantharaman, Harrison Brand, Stuart Cantsilieris, Han Cao, Eliza Cerveira, Chong Chen, Xintong Chen, Chen-Shan Chin, Zechen Chong, Nelson T. Chuang, Christine Lambert, Deanna M. Church, Laura Clarke, Andrew Farrell, Joey Flores, Timur Galeev, David U. Gorkin, Madhusudan Gujral, Victor Guryev, Haynes Heaton, Jonas Korlach, Sushant Kumar, Jee Young Kwon, Ernest T. Lam, Jong Eun Lee, Joyce Lee, Wan-Ping Lee, Sau Peng Lee, Shantao Li, Patrick Marks, Karine A. Viaud-Martinez, Sascha Meiers, Katherine M. Munson, Fábio C. P. Navarro, Bradley J. Nelson, Conor Nodzak, Amina Noor, Sofia Kyriazopoulou-Panagiotopoulou, Andy Wing Chun Pang, Yunjiang Qiu, Gabriel Rosanio, Mallory Ryan, Adrian M. Stütz, Diana C.J. Spierings, Alistair Ward, AnneMarie E. Welch, Ming Xiao, Wei Xu, Chengsheng Zhang, Qihui Zhu, Xiangqun Zheng-Bradley, Ernesto Lowy, Sergei Yakneen, Steven A. McCarroll, Goo Jun, Li Ding, Chong-Lek Koh, Bing Ren, Paul Flicek, Ken Chen, Mark Gerstein, Pui-Yan Kwok, Peter M. Lansdorp, Gábor Marth, Jonathan Sebat, Xinghua Shi, Ali Bashir, Kai Ye, Scott E. Devine, Michael E. Talkowski, Ryan E. Mills, Tobias Marschall, Jan O. Korbel, Charles Lee
Proceedings of the National Academy of Sciences
October 28, 2019

Abstract

Short tandem repeats (STRs) and variable number tandem repeats (VNTRs) are important sources of natural and disease-causing variation, yet they have been problematic to resolve in reference genomes and genotype with short-read technology. We created a framework to model the evolution and instability of STRs and VNTRs in apes. We phased and assembled 3 ape genomes (chimpanzee, gorilla, and orangutan) using long-read and 10x Genomics linked-read sequence data for 21,442 human tandem repeats discovered in 6 haplotype-resolved assemblies of Yoruban, Chinese, and Puerto Rican origin. We define a set of 1,584 STRs/VNTRs expanded specifically in humans, including large tandem repeats affecting coding and noncoding portions of genes (e.g., MUC3A, CACNA1C). We show that short interspersed nuclear element–VNTR–Alu (SVA) retrotransposition is the main mechanism for distributing GC-rich human-specific tandem repeat expansions throughout the genome but with a bias against genes. In contrast, we observe that VNTRs not originating from retrotransposons have a propensity to cluster near genes, especially in the subtelomere. Using tissue-specific expression from human and chimpanzee brains, we identify genes where transcript isoform usage differs significantly, likely caused by cryptic splicing variation within VNTRs. Using single-cell expression from cerebral organoids, we observe a strong effect for genes associated with transcription profiles analogous to intermediate progenitor cells. Finally, we compare the sequence composition of some of the largest human-specific repeat expansions and identify 52 STRs/VNTRs with at least 40 uninterrupted pure tracts as candidates for genetically unstable regions associated with disease.
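
The purity criterion in the final sentence can be illustrated with a minimal sketch. It assumes that an "uninterrupted pure tract" means consecutive exact copies of a single repeat motif and that the threshold of 40 counts copies; the abstract does not spell out either point. The function name `longest_pure_tract`, the constant `PURE_COPY_THRESHOLD`, and the example sequences are hypothetical and are not taken from the paper's pipeline.

```python
# Illustrative sketch only: not the authors' method.
# Assumption: a "pure tract" is a run of consecutive exact motif copies.

PURE_COPY_THRESHOLD = 40  # threshold quoted in the abstract, read here as motif copies


def longest_pure_tract(sequence: str, motif: str) -> int:
    """Return the largest number of consecutive exact copies of `motif` in `sequence`."""
    if not motif:
        raise ValueError("motif must be non-empty")
    k = len(motif)
    best = 0
    # Try every start position; count how many back-to-back copies begin there.
    for start in range(len(sequence) - k + 1):
        copies = 0
        pos = start
        while sequence.startswith(motif, pos):
            copies += 1
            pos += k
        best = max(best, copies)
    return best


def is_instability_candidate(sequence: str, motif: str) -> bool:
    """Flag a locus whose longest pure tract reaches the assumed threshold."""
    return longest_pure_tract(sequence, motif) >= PURE_COPY_THRESHOLD


if __name__ == "__main__":
    pure = "TT" + "CAG" * 40 + "TT"                      # 40 uninterrupted copies
    interrupted = "CAG" * 20 + "CAT" + "CAG" * 20       # one interruption mid-tract
    print(is_instability_candidate(pure, "CAG"))
    print(is_instability_candidate(interrupted, "CAG"))
```

A single interrupting unit splits a tract in two, so a locus with 40 motif copies in total but one interruption would not pass under this reading.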

