Publishes on Gut microbiota and health, Effects of Environmental Stressors on Livestock, Genomics and Phylogenetic Studies. 24 papers and 13.1k citations.
To understand the impact of gut microbes on human health and well-being it is crucial to assess their genetic potential. Here we describe the Illumina-based metagenomic sequencing, assembly and characterization of 3.3 million non-redundant microbial genes, derived from 576.7 gigabases of sequence, from faecal samples of 124 European individuals. The gene set, ∼150 times larger than the human gene complement, contains an overwhelming majority of the prevalent (more frequent) microbial genes of the cohort and probably includes a large proportion of the prevalent human intestinal microbial genes. The genes are largely shared among individuals of the cohort. Over 99% of the genes are bacterial, indicating that the entire cohort harbours between 1,000 and 1,150 prevalent bacterial species and each individual at least 160 such species, which are also largely shared. We define and describe the minimal gut metagenome and the minimal gut bacterial genome in terms of functions present in all individuals and most bacteria, respectively.
The human body plays host to an estimated 100 trillion microbial cells, most of them in the gut, where they have a profound influence on human physiology and nutrition and are now regarded as crucial for human life. Gut microbes contribute to the energy harvest from food, and changes in the gut microbiome may be associated with bowel diseases or obesity. Now the international MetaHIT (Metagenomics of the Human Intestinal Tract) project has published a gene catalogue of the human gut microbiome derived from 124 healthy, overweight and obese human adults, as well as inflammatory disease patients, from Denmark and Spain. The resulting data provide the first insights into this gene set, which is over 150 times larger than the human gene complement, and show that the genes are largely shared among individuals. Based on the variety of functions encoded by the gene set, it is possible to define both a minimal gut metagenome and a minimal gut bacterial genome.
Deep metagenomic sequencing and characterization of the human gut microbiome from healthy and obese individuals, as well as those suffering from inflammatory bowel disease, provide the first insights into this gene set and how much of it is shared among individuals. The minimal gut metagenome and the minimal gut bacterial genome are also described.
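The two "minimal" sets described above can be read as simple set operations: functions present in all individuals, and functions present in most bacterial genomes. The following is a minimal sketch of that reading, not the MetaHIT pipeline. The function names, the toy function identifiers and the `min_fraction` threshold standing in for "most" are all assumptions made for illustration.

```python
from collections import Counter


def minimal_metagenome(functions_by_individual):
    """Return the functions found in every individual (set intersection)."""
    sets = [set(funcs) for funcs in functions_by_individual.values()]
    if not sets:
        return set()
    return set.intersection(*sets)


def minimal_genome(functions_by_genome, min_fraction=0.9):
    """Return the functions found in at least min_fraction of genomes.

    min_fraction is an assumed stand-in for "most bacteria";
    the source does not state a threshold.
    """
    n_genomes = len(functions_by_genome)
    counts = Counter(
        func for funcs in functions_by_genome.values() for func in set(funcs)
    )
    return {func for func, c in counts.items() if c >= min_fraction * n_genomes}


# Hypothetical toy data: function identifiers per individual and per genome.
individuals = {
    "ind_A": {"K1", "K2", "K3"},
    "ind_B": {"K1", "K2"},
    "ind_C": {"K1", "K2", "K4"},
}
genomes = {
    "g1": {"K1", "K2"},
    "g2": {"K1"},
    "g3": {"K1", "K2"},
    "g4": {"K1", "K2", "K5"},
}

shared_by_all = minimal_metagenome(individuals)
shared_by_most = minimal_genome(genomes, min_fraction=0.75)
```

With real data the inputs would be functional annotations per sample or per genome, and the choice of threshold for "most" would change the size of the second set considerably.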
Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we used uniquely mapped reads to assemble a high-quality consensus sequence for 92% of the Asian individual’s genome. We identified approximately 3 million single-nucleotide polymorphisms (SNPs) inside this region, of which 13.6% were not in the dbSNP database. Genotyping analysis showed that SNP identification had high accuracy and consistency, indicating the high sequence quality of this assembly. We also carried out heterozygote phasing and haplotype prediction against HapMap CHB and JPT haplotypes (Chinese and Japanese, respectively), sequence comparison with the two available individual genomes (J. D. Watson and J. C. Venter), and structural variation identification. These variations were considered for their potential biological impact. Our sequence data and analyses demonstrate the potential usefulness of next-generation sequencing technologies for personal genomics.
The power of the latest massively parallel synthetic DNA sequencing technologies is demonstrated in two major collaborations that shed light on the nature of genomic variation with ethnicity. The first describes the genomic characterization of an individual from the Yoruba ethnic group of west Africa. The second reports a personal genome of a Han Chinese, the group comprising 30% of the world's population. These new resources can now be used in conjunction with the Venter, Watson and NIH reference sequences. A separate study looked at genetic ethnicity on the continental scale, based on data from 1,387 individuals from more than 30 European countries. Overall there was little genetic variation between countries, but the differences that do exist correspond closely to the geographic map. Statistical analysis of the genome data places 50% of the individuals within 310 km of their reported origin. As well as its relevance for testing genetic ancestry, this work has implications for evaluating genome-wide association studies that link genes with diseases.
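A statement such as "50% of the individuals within 310 km" describes the median distance between a genetically predicted location and the reported origin. The sketch below shows how such a summary could be computed, assuming great-circle (haversine) distance on a spherical Earth. The source does not say which distance measure was used, and the function names and coordinates are hypothetical.

```python
import math
from statistics import median

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def median_error_km(pairs):
    """Median distance over (predicted, reported) coordinate pairs.

    Half of the individuals lie within this distance of their reported origin.
    """
    return median(haversine_km(*pred, *rep) for pred, rep in pairs)


# Hypothetical (predicted, reported) coordinates, for illustration only.
example_pairs = [
    ((0.0, 0.0), (0.0, 0.0)),
    ((0.0, 0.0), (0.0, 1.0)),
    ((0.0, 0.0), (0.0, 2.0)),
]
example_median = median_error_km(example_pairs)
```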
Species in the genus Nannochloropsis are promising candidates for both biofuel and biomass production because they accumulate abundant fatty acids and grow rapidly; however, their sexual reproduction has not been studied. Reconstructing their metabolic pathways, such as that of polyunsaturated fatty acid (PUFA) biosynthesis, and understanding their biological characteristics, such as nuclear ploidy and reproductive strategy, will facilitate their genetic improvement through genetic engineering, mutation and clonal expansion. In this study, the genome of N. oceanica S. Suda et Miyashita was sequenced with next-generation Illumina GA sequencing technology. The genome was ∼30 Mb in size and contained 11,129 protein-encoding genes. Of these, 59.65% were annotated by alignment against diverse protein databases, and 29.68% were assigned at least one function described in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. The low frequency of polymorphic nucleotides (one in 22.06 kb) and the obvious deviation from the 1:1 (major:minor, minor ≥10) expectation indicated the nuclear monoploidy of N. oceanica. The lack of the majority of meiosis-specific proteins implied that this alga reproduces asexually. In combination, the nuclear monoploidy and asexual propagation led us to favor the hypothesis that N. oceanica is a premeiotic or ameiotic alga. In addition, sequence similarity-based searching identified the elongase- and desaturase-encoding genes involved in the biosynthesis of long-chain PUFAs, which provide the genetic basis of its rich content of eicosapentaenoic acid (EPA). The functional genes and their metabolic pathways profiled against the genome sequence will facilitate integrative investigation of this alga.
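The ploidy argument rests on allele balance: at a true heterozygous site in a diploid, reads supporting the major and minor alleles are expected at roughly 1:1, so a consistent departure from that ratio argues against diploidy. A minimal sketch of one way to check this follows. It uses an exact two-sided binomial test against an allele fraction of 0.5, which is an assumed choice since the abstract does not name the test. The function names, the `alpha` cutoff and the read counts are hypothetical; only the minor ≥10 filter comes from the text.

```python
from math import comb


def binom_two_sided_p(minor, total):
    """Exact two-sided binomial p-value for H0: minor-allele fraction = 0.5."""
    # Under p = 0.5 the distribution is symmetric, so double the smaller tail.
    k = min(minor, total - minor)
    tail = sum(comb(total, i) for i in range(k + 1)) / 2**total
    return min(1.0, 2 * tail)


def fraction_unbalanced(sites, min_minor=10, alpha=0.01):
    """Fraction of candidate sites whose read counts reject a 1:1 ratio.

    sites: iterable of (major_reads, minor_reads) per polymorphic position.
    min_minor: keep only sites with at least this many minor-allele reads
               (the abstract uses minor >= 10).
    alpha: assumed significance cutoff, not taken from the source.
    """
    kept = [(maj, mnr) for maj, mnr in sites if mnr >= min_minor]
    if not kept:
        return None
    rejected = sum(
        1 for maj, mnr in kept if binom_two_sided_p(mnr, maj + mnr) < alpha
    )
    return rejected / len(kept)


# Hypothetical read counts (major, minor) at four candidate sites.
example_sites = [(90, 10), (95, 12), (20, 18), (50, 5)]
example_fraction = fraction_unbalanced(example_sites)
```

A fraction near 1 would be consistent with the monoploidy interpretation, while a fraction near 0 would point to ordinary heterozygous sites. A real analysis would also need to account for sequencing error, paralogous regions and multiple testing.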