Kurume University
ORCID: 0000-0002-4509-2653
Publishes on Escherichia coli research studies, Genomics and Phylogenetic Studies, Bacteriophages and microbial interactions. 299 papers and 12.1k citations.
Although many de novo genome assembly projects have recently been conducted using high-throughput sequencers, assembling highly heterozygous diploid genomes is a substantial challenge due to the increased complexity of the predominantly used de Bruijn graph structure. To meet the increasing demand for sequencing of nonmodel and/or wild-type samples, inbred lines or fosmid-based hierarchical sequencing methods are used in most cases to overcome such problems. However, these methods are costly and time-consuming, forfeiting the advantages of massively parallel sequencing. Here, we describe a novel de novo assembler, Platanus, that can effectively manage high-throughput data from heterozygous samples. Platanus assembles DNA fragments (reads) into contigs by constructing de Bruijn graphs with automatically optimized k-mer sizes, followed by the scaffolding of contigs based on paired-end information. The complicated graph structures that result from the heterozygosity are simplified not only during the contig assembly step but also during the scaffolding step. We evaluated the assembly results on eukaryotic samples with various levels of heterozygosity. Compared with other assemblers, Platanus yields assembly results that have a larger scaffold NG50 length without any accompanying loss of accuracy in both simulated and real data. In addition, Platanus recorded the largest scaffold NG50 values for two of the three low-heterozygosity species used in the de novo assembly contest, Assemblathon 2. Platanus therefore provides a novel and efficient approach for the assembly of gigabase-sized highly heterozygous genomes and is an attractive alternative to the existing assemblers designed for genomes of lower heterozygosity.
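To illustrate why heterozygosity complicates a de Bruijn graph, the following is a minimal sketch and not Platanus's actual implementation. It assumes a fixed k (the abstract describes automatically optimized k-mer sizes), ignores reverse complements, coverage, and sequencing errors, and uses two made-up haplotype strings. The function names `build_de_bruijn` and `branching_nodes` are hypothetical.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer node to the set of (k-1)-mer nodes that follow it."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Each k-mer contributes one edge: its prefix -> its suffix.
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def branching_nodes(graph):
    """Return nodes with more than one successor (where the path forks)."""
    return sorted(node for node, successors in graph.items() if len(successors) > 1)


# Two toy haplotypes that differ by a single base (hypothetical data).
hap_a = "ACGTTGCATGGA"
hap_b = "ACGTTGAATGGA"

graph = build_de_bruijn([hap_a, hap_b], k=4)
forks = branching_nodes(graph)
print(forks)
```

In this toy case the single-base difference makes the path fork at one node and rejoin a few nodes later, forming a "bubble". An assembler aimed at heterozygous genomes has to recognize and simplify such structures rather than break the contig at each fork.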
Numerous microbes inhabit the human intestine, many of which are uncharacterized or uncultivable. They form a complex microbial community that deeply affects human physiology. To identify the genomic features common to all human gut microbiomes as well as those variable among them, we performed a large-scale comparative metagenomic analysis of fecal samples from 13 healthy individuals of various ages, including unweaned infants. We found that, while the gut microbiota from unweaned infants were simple and showed high inter-individual variation in taxonomic and gene composition, those from adults and weaned children were more complex but showed high functional uniformity regardless of age or sex. In searching for the genes over-represented in gut microbiomes, we identified 237 gene families commonly enriched in adult-type and 136 families in infant-type microbiomes, with a small overlap. An analysis of their predicted functions revealed various strategies employed by each type of microbiota to adapt to its intestinal environment, suggesting that these gene sets encode the core functions of adult and infant-type gut microbiota. By analysing the orphan genes, we identified 647 new gene families that are exclusively present in human intestinal microbiomes. In addition, we discovered a conjugative transposon family explosively amplified in human gut microbiomes, which strongly suggests that the intestine is a 'hot spot' for horizontal gene transfer between microbes.
Among the various pathogenic Escherichia coli strains, enterohemorrhagic E. coli (EHEC) is the most devastating. Although serotype O157:H7 strains are the most prevalent, strains of different serotypes also possess similar pathogenic potential. Here, we present the results of a genomic comparison between EHECs of serotypes O157, O26, O111, and O103, as well as 21 other fully sequenced E. coli/Shigella strains. All EHECs have much larger genomes (5.5-5.9 Mb) than the other strains and contain surprisingly large numbers of prophages and integrative elements (IEs). The gene contents of the 4 EHECs do not follow the phylogenetic relationships of the strains, and they share virulence genes for Shiga toxins and many other factors. We found many lambdoid phages, IEs, and virulence plasmids that carry the same or similar virulence genes but have distinct evolutionary histories, indicating that independent acquisition of these mobile genetic elements has driven the evolution of each EHEC. Particularly interesting is the evolution of the type III secretion system (T3SS). We found that the T3SS of EHECs is composed of genes that were introduced by 3 different types of genetic elements: an IE referred to as the locus of enterocyte effacement, which encodes a central part of the T3SS; SpLE3-like IEs; and lambdoid phages carrying numerous T3SS effector genes and other T3SS-related genes. Our data demonstrate how E. coli strains of different phylogenies can independently evolve into EHECs, providing unique insights into the mechanisms underlying the parallel evolution of complex virulence systems in bacteria.