Platanus-allee is a de novo haplotype assembler enabling a comprehensive access to divergent heterozygous regions

Rei Kajitani (Tokyo Institute of Technology), Dai Yoshimura (Tokyo Institute of Technology), Miki Okuno (Tokyo Institute of Technology), Yohei Minakuchi (National Institute of Genetics), Hiroshi Kagoshima (National Institute of Genetics), Asao Fujiyama (National Institute of Genetics), Kaoru Kubokawa (The University of Tokyo), Yuji Kohara (National Institute of Genetics), Atsushi Toyoda (National Institute of Genetics), Takehiko Itoh (Tokyo Institute of Technology)
Nature Communications
April 12, 2019
Cited by 144
Open Access

Abstract

The ultimate goal of diploid genome determination is to decode each homologous chromosome completely and independently, and several programs that phase haplotypes from consensus sequences have been developed. These methods work well for genomes with low heterozygosity, but many species are highly heterozygous. Additionally, there are highly divergent regions (HDRs), where the haplotype sequences differ considerably. Because HDRs are likely to direct various interesting biological phenomena, many genomic analysis targets fall within these regions. However, existing phasing methods cannot access them, so costly traditional methods have to be adopted. Here, we develop a de novo haplotype assembler, Platanus-allee (http://platanus.bio.titech.ac.jp/platanus2), which first constructs each haplotype sequence and then untangles the assembly graphs using sequence links and synteny information. A comprehensive benchmark analysis reveals that Platanus-allee exhibits high recall and precision, particularly for HDRs. Using this approach, previously unknown HDRs are detected in the human genome, which may uncover novel aspects of genome variability.
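
To make the idea of haplotypes diverging inside an assembly graph concrete, the following toy sketch builds a small de Bruijn graph from two short haplotype strings and reports the node where their paths split. It is only an illustration under stated assumptions, not Platanus-allee's implementation: the abstract does not specify the graph type, so the de Bruijn construction, the sequences, the value of k, and the function names `build_de_bruijn` and `branch_nodes` are all hypothetical.

```python
from collections import defaultdict


def build_de_bruijn(sequences, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in any sequence."""
    graph = defaultdict(set)
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def branch_nodes(graph):
    """Return nodes with more than one successor, i.e. where paths diverge."""
    return sorted(node for node, successors in graph.items() if len(successors) > 1)


# Two made-up haplotypes that differ at a single position (A vs. T).
hap_a = "ACGTTGCAAGGCT"
hap_b = "ACGTTGCTAGGCT"
k = 5  # illustrative k-mer size, not a recommended setting

graph = build_de_bruijn([hap_a, hap_b], k)
branches = branch_nodes(graph)
print(branches)  # nodes at which the two haplotype paths split
```

In this toy case the single difference produces one branch that later rejoins, a small "bubble". Deciding which branch belongs to which haplotype across many such bubbles is the untangling problem the abstract refers to; the sketch does not attempt that step.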

