Structurally divergent and recurrently mutated regions of primate genomes

Yafei Mao (Shanghai Jiao Tong University; University of Washington), William T. Harvey (University of Washington), David Porubský (University of Washington), Katherine M. Munson (University of Washington), Kendra Hoekzema (University of Washington), Alexandra P. Lewis (University of Washington), Peter A. Audano (University of Washington), Allison N. Rozanski (Shanghai Jiao Tong University), Xiangyu Yang (Shanghai Jiao Tong University), Shilong Zhang (Shanghai Jiao Tong University), DongAhn Yoo (Howard Hughes Medical Institute), David Gordon (Howard Hughes Medical Institute), Tyler Fair (University of California, San Francisco), Xiaoxi Wei (Shanghai Jiao Tong University), Glennis A. Logsdon (University of California, Santa Cruz), Marina Haukness (University of California, Santa Cruz), Philip C. Dishuck (University of Washington), Hyeonsoo Jeong (Broad Institute), Ricardo C.H. del Rosario (McGovern Institute for Brain Research), Vanessa L. Bauer (University of Colorado Boulder), Will T. Fattor (The University of Texas MD Anderson Cancer Center), Gregory K. Wilkerson (North Carolina State University), Yuxiang Mao (Shanghai Jiao Tong University; Chinese Academy of Sciences), Yongyong Shi (Shanghai Jiao Tong University), Qiang Sun (University of California, Santa Cruz), Qing Lü (Allen Institute for Brain Science), Benedict Paten (University of California, San Francisco), Trygve E. Bakken (Broad Institute), Alex A. Pollen (University of Colorado Boulder), Guoping Feng (McGovern Institute for Brain Research), Sara L. Sawyer (Oregon National Primate Research Center), Wesley C. Warren (Howard Hughes Medical Institute), Lucia Carbone (Oregon Health & Science University), Evan E. Eichler (Howard Hughes Medical Institute)
Cell
February 29, 2024
Cited by 72. Open Access.

Abstract

Using multiple long-read sequencing technologies, we sequenced and assembled the genomes of chimpanzee, bonobo, gorilla, orangutan, gibbon, macaque, owl monkey, and marmoset. We identified 1,338,997 lineage-specific fixed structural variants (SVs) disrupting 1,561 protein-coding genes and 136,932 regulatory elements, including the most complete set of human-specific fixed differences. We estimate that 819.47 Mbp, or ∼27% of the genome, has been affected by SVs across primate evolution. We identify 1,607 structurally divergent regions in which recurrent structural variation creates SV hotspots where genes are recurrently lost (e.g., CARD, C4, and OLAH gene families) and additional lineage-specific genes are generated (e.g., CKAP2, VPS36, ACBD7, and NEK5 paralogs), with some becoming targets of rapid chromosomal diversification and positive selection (e.g., RGPD gene family). High-fidelity long-read sequencing has made these dynamic regions of the genome accessible for sequence-level analyses within and between primate species.
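
As a rough illustration of how an "affected fraction of the genome" figure can be computed, the sketch below merges overlapping SV intervals so that no base is counted twice, then divides by the sequence length. This is not the authors' pipeline: the function name `merged_length`, the toy intervals, and the 1,000 bp contig length are all hypothetical. The last line only back-calculates the genome size implied by the two figures quoted in the abstract (819.47 Mbp and ∼27%).

```python
def merged_length(intervals):
    """Return the number of bases covered by the union of half-open
    (start, end) intervals, so overlapping SVs are not double-counted."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Close the previous merged block (if any) and open a new one.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent interval: extend the current block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


# Hypothetical toy data: three SV intervals on a 1,000 bp contig.
toy_svs = [(100, 200), (150, 300), (500, 550)]
toy_contig_length = 1000
toy_fraction = merged_length(toy_svs) / toy_contig_length

# Genome size implied by the abstract's figures (819.47 Mbp is ~27%).
implied_genome_mbp = 819.47 / 0.27
```

With the rounded 27% value, the implied genome size comes out at a little over 3,000 Mbp, which is in line with the size of a primate genome.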

