quarTeT: a telomere-to-telomere toolkit for gap-free genome assembly and centromeric repeat identification

Yunzhi Lin (Anhui Agricultural University), Chen Ye (Anhui Agricultural University), Xingzhu Li (Anhui Agricultural University), Qinyao Chen (Anhui Agricultural University), Ying Wu (Anhui Agricultural University), Feng Zhang (Anhui Agricultural University), Rui Pan (Anhui Agricultural University), Sijia Zhang (Anhui Agricultural University), Shuxia Chen (Anhui Agricultural University), Xu Wang (Agricultural Genomics Institute at Shenzhen), Shuo Cao (Huazhong Agricultural University), Yingzhen Wang (Anhui Agricultural University), Yi Yue (Anhui Agricultural University), Yongsheng Liu (Anhui Agricultural University), Junyang Yue (Anhui Agricultural University)
Horticulture Research
June 13, 2023
Cited by 285
Open Access

Abstract

A high-quality genome is the basis for studies in functional, evolutionary, and comparative genomics. With the emergence of the ‘telomere-to-telomere (T2T) assembly’ era, most attention has focused on resolving complex chromosome structures and highly repetitive sequences. However, bioinformatic tools for the automatic construction and/or characterization of T2T genomes remain limited. Here, we developed a user-friendly web toolkit, quarTeT, which currently includes four modules: AssemblyMapper, GapFiller, TeloExplorer, and CentroMiner. First, AssemblyMapper assembles phased contigs into a chromosome-level genome by referring to a closely related genome. Then, GapFiller attempts to fill all unclosed gaps in a given genome with the aid of additional ultra-long sequences. Finally, TeloExplorer and CentroMiner identify candidate telomeres and centromeres, together with their locations on each chromosome. The four modules can be used alone or in combination for T2T genome assembly and characterization. As a case study, using all of the quarTeT modules, we produced an Actinidia chinensis genome assembly of a quality comparable to the reported genome Hongyang v4.0, which was assembled with additional manual handling. Further evaluation of CentroMiner on the Arabidopsis thaliana and Oryza sativa genomes showed that quarTeT is capable of identifying all the centromeric regions previously detected by experimental methods. Collectively, quarTeT is an efficient toolkit for studies of large-scale T2T genomes and can be accessed at http://www.atcgn.com:8080/quarTeT/home.html without registration.
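
To make the idea of locating candidate telomeres concrete, the following is a minimal Python sketch that counts copies of a telomeric repeat unit in a window at each end of a chromosome sequence. It is not the TeloExplorer implementation, which the abstract does not describe. The function names, the default repeat unit TTTAGGG (the Arabidopsis-type plant telomere repeat), and the window size are all assumptions chosen for illustration.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    return seq.upper().translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def count_terminal_repeats(seq: str, unit: str = "TTTAGGG", window: int = 1000) -> dict:
    """Count copies of a repeat unit near both ends of a sequence.

    Hypothetical helper, not part of quarTeT. The unit and its reverse
    complement are both counted, and the larger count is reported per end,
    because the strand carrying the repeat differs between the two ends.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    seq = seq.upper()
    unit = unit.upper()
    rc_unit = reverse_complement(unit)
    left = seq[:window]    # first `window` bases
    right = seq[-window:]  # last `window` bases
    return {
        "left": max(left.count(unit), left.count(rc_unit)),
        "right": max(right.count(unit), right.count(rc_unit)),
    }


# Toy chromosome: CCCTAAA repeats at the start, TTTAGGG repeats at the end.
toy_chromosome = "CCCTAAA" * 50 + "ACGT" * 500 + "TTTAGGG" * 40
counts = count_terminal_repeats(toy_chromosome, window=300)
print(counts)
```

A high count at both ends would suggest that the chromosome is capped by telomeres on both sides. A real tool would also need to tolerate degenerate repeat units and to infer the unit when it is not known in advance, which this sketch does not attempt.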

