agriGO: a GO analysis toolkit for the agricultural community
Zhou Du, Xin Zhou, Yi Ling et al. | Nucleic Acids Research | 2010
Gene Ontology (GO), the de facto standard in gene functionality description, is used widely in functional annotation and enrichment analysis. Here, we introduce agriGO, an integrated web-based GO analysis toolkit for the agricultural community, building on the advantages of our previous GO enrichment tool (EasyGO) to meet analysis demands from new technologies and research objectives. EasyGO is valuable for its proficiency, and has proved useful in uncovering biological knowledge in massive data sets from high-throughput experiments. For agriGO, the system architecture and website interface were redesigned to improve performance and accessibility. The supported organisms and gene identifiers were substantially expanded (including 38 agricultural species composed of 274 data types). The requirement on user input is more flexible, in that user-defined reference and annotation are accepted. Moreover, a new analysis approach using the Gene Set Enrichment Analysis strategy and customizable features is provided. Four tools, SEA (Singular Enrichment Analysis), PAGE (Parametric Analysis of Gene set Enrichment), BLAST4ID (Transfer IDs by BLAST) and SEACOMPARE (Cross comparison of SEA), are integrated as a toolkit to meet different demands. We also provide a cross-comparison service so that different data sets can be compared and explored in a visualized way. Lastly, agriGO functions as a GO data repository with search and download functions; agriGO is publicly accessible at http://bioinfo.cau.edu.cn/agriGO/.
SOAPdenovo-Trans: de novo transcriptome assembly with short RNA-Seq reads
MOTIVATION: Transcriptome sequencing has long been the favored method for quickly and inexpensively obtaining a large number of gene sequences from an organism with no reference genome. Owing to the rapid increase in throughput and decrease in costs of next-generation sequencing, RNA-Seq in particular has become the method of choice. However, the very short reads (e.g. 2 × 90 bp paired ends) from next-generation sequencing make de novo assembly to recover complete or full-length transcript sequences an algorithmic challenge. RESULTS: Here, we present SOAPdenovo-Trans, a de novo transcriptome assembler designed specifically for RNA-Seq. We evaluated its performance on transcriptome datasets from rice and mouse. Using as our benchmarks the known transcripts from these well-annotated genomes (sequenced a decade ago), we assessed how SOAPdenovo-Trans and two other popular transcriptome assemblers handled such practical issues as alternative splicing and variable expression levels. Our conclusion is that SOAPdenovo-Trans provides higher contiguity, lower redundancy and faster execution. AVAILABILITY AND IMPLEMENTATION: Source code and user manual are available at http://sourceforge.net/projects/soapdenovotrans/.
Evolutionary History of the Hymenoptera
The evolution and genomic basis of beetle diversity
Duane D. McKenna, Seunggwan Shin, Dirk Ahrens et al. | Proceedings of the National Academy of Sciences | 2019
The order Coleoptera (beetles) is arguably the most speciose group of animals, but the evolutionary history of beetles, including the impacts of plant feeding (herbivory) on beetle diversification, remains poorly understood. We inferred the phylogeny of beetles using 4,818 genes for 146 species, estimated timing and rates of beetle diversification using 89 genes for 521 species representing all major lineages, and traced the evolution of beetle genes enabling symbiont-independent digestion of lignocellulose using 154 genomes or transcriptomes. Phylogenomic analyses of these uniquely comprehensive datasets resolved previously controversial beetle relationships, dated the origin of Coleoptera to the Carboniferous, and supported the codiversification of beetles and angiosperms. Moreover, plant cell wall-degrading enzymes (PCWDEs) obtained from bacteria and fungi via horizontal gene transfers may have been key to the Mesozoic diversification of herbivorous beetles. Remarkably, both major independent origins of specialized herbivory in beetles coincide with the first appearances of an arsenal of PCWDEs encoded in their genomes. Furthermore, corresponding (Jurassic) diversification rate increases suggest that these novel genes triggered adaptive radiations that resulted in nearly half of all living beetle species. We propose that PCWDEs enabled efficient digestion of plant tissues, including lignocellulose in cell walls, facilitating the evolution of uniquely specialized plant-feeding habits, such as leaf mining and stem and wood boring.
Beetle diversity thus appears to have resulted from multiple factors, including low extinction rates over a long evolutionary history, codiversification with angiosperms, and adaptive radiations of specialized herbivorous beetles following convergent horizontal transfers of microbial genes encoding PCWDEs.
Environmental Barcoding: A Next-Generation Sequencing Approach for Biomonitoring Applications Using River Benthos
Timely and accurate biodiversity analysis poses an ongoing challenge for the success of biomonitoring programs. Morphology-based identification of bioindicator taxa is time-consuming, and rarely supports species-level resolution, especially for immature life stages. Much work has been done in the past decade to develop alternative approaches for biodiversity analysis using DNA sequence-based approaches such as molecular phylogenetics and DNA barcoding. Ongoing assembly of DNA barcode reference libraries will provide the basis for a DNA-based identification system. The use of recently introduced next-generation sequencing (NGS) approaches in biodiversity science has the potential to further extend the application of DNA information for routine biomonitoring applications to an unprecedented scale. Here we demonstrate the feasibility of using 454 massively parallel pyrosequencing for species-level analysis of freshwater benthic macroinvertebrate taxa commonly used for biomonitoring. We designed our experiments to directly compare morphology-based, Sanger sequencing DNA barcoding, and next-generation environmental barcoding approaches. Our results show the ability of 454 pyrosequencing of mini-barcodes to accurately identify all species with more than 1% abundance in the pooled mixture. Although the approach failed to identify 6 rare species in the mixture, the presence of sequences from 9 species that were not represented by individuals in the mixture provides evidence that DNA-based analysis may yet provide a valuable approach in finding rare species in bulk environmental samples. We further demonstrate the application of the environmental barcoding approach by comparing benthic macroinvertebrates from an urban region to those obtained from a conservation area.
Although considerable effort will be required to robustly optimize NGS tools to identify species from bulk environmental samples, our results indicate the potential of an environmental barcoding approach for biomonitoring programs.