elements are often located in genic regions and preferentially insert nearby or within genes, indicating their impact on the evolution of genes and their potential as mutagenesis tools.
State Grid Corporation of China (China)
ORCID: 0000-0002-2776-6669
Publishes on Chromosomal and Genetic Variations, Genomics and Phylogenetic Studies, Plant Disease Resistance and Genetics. 122 papers and 20.6k citations.
Abstract. Background: Sequencing technology and assembly algorithms have matured to the point that high-quality de novo assembly is possible for large, repetitive genomes. Current assemblies traverse transposable elements (TEs) and provide an opportunity for comprehensive annotation of TEs. Numerous methods exist for annotation of each class of TEs, but their relative performances have not been systematically compared. Moreover, a comprehensive pipeline is needed to produce a non-redundant library of TEs for species lacking this resource to generate whole-genome TE annotations. Results: We benchmark existing programs based on a carefully curated library of rice TEs. We evaluate the performance of methods annotating long terminal repeat (LTR) retrotransposons, terminal inverted repeat (TIR) transposons, short TIR transposons known as miniature inverted transposable elements (MITEs), and Helitrons. Performance metrics include sensitivity, specificity, accuracy, precision, FDR, and F1. Using the most robust programs, we create a comprehensive pipeline called Extensive de-novo TE Annotator (EDTA) that produces a filtered non-redundant TE library for annotation of structurally intact and fragmented elements. EDTA also deconvolutes nested TE insertions frequently found in highly repetitive genomic regions. Using other model species with curated TE libraries (maize and Drosophila), EDTA is shown to be robust across both plant and animal species. Conclusions: The benchmarking results and pipeline developed here will greatly facilitate TE annotation in eukaryotic genomes. These annotations will promote a much more in-depth understanding of the diversity and evolution of TEs at both intra- and inter-species levels. EDTA is open-source and freely available: https://github.com/oushujun/EDTA.
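The six performance metrics named in the abstract above can all be derived from a single confusion matrix. The sketch below is illustrative only and is not taken from the EDTA code base: the `Confusion` class and `benchmark_metrics` function are hypothetical names, and it assumes the four counts (for example, base pairs of annotated sequence compared against a curated library) have already been tallied.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Hypothetical container for confusion-matrix counts (e.g. base pairs)."""
    tp: int  # TE sequence correctly annotated as TE
    fp: int  # non-TE sequence annotated as TE
    tn: int  # non-TE sequence correctly left unannotated
    fn: int  # TE sequence missed by the annotation


def _ratio(numerator: float, denominator: float) -> float:
    # Return NaN instead of raising when a metric is undefined.
    return numerator / denominator if denominator else math.nan


def benchmark_metrics(c: Confusion) -> dict[str, float]:
    """Standard definitions of the six metrics listed in the abstract."""
    precision = _ratio(c.tp, c.tp + c.fp)
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "precision": precision,
        "fdr": 1.0 - precision,  # false discovery rate = FP / (TP + FP)
        "f1": _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn),
    }


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    for name, value in benchmark_metrics(Confusion(tp=80, fp=20, tn=880, fn=20)).items():
        print(f"{name}: {value:.4f}")
```

Reporting FDR alongside precision is redundant in the arithmetic sense (FDR = 1 − precision), but listing both matches the set of metrics given in the abstract.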
In the early 1990s an outbreak of papaya ringspot virus (PRSV) in the papaya groves in the Puna district of Hawaii caused severe damage to an important crop. Since then, the planting of two transgenic cultivars resistant to the virus, called 'SunUp' and 'Rainbow', has helped to maintain yields. SunUp is a transgenic red-fleshed fruit that expresses the coat protein gene of a mild mutant of PRSV, conferring resistance via post-transcriptional gene silencing. Rainbow is a yellow-fleshed (and therefore more popular) F1 hybrid bred from SunUp. Now the draft genome sequence of the SunUp strain of papaya has been determined, a first for a commercial virus-resistant transgenic fruit tree. Comparison of this plant genome to those of Arabidopsis and others sheds light on the evolution of qualities such as biosynthesis, starch deposition, control of photosynthesis and pathways for creating the volatile compounds that contribute to the characteristic flavour of papaya. On the cover, the disease-free transgenic Rainbow and the severely infected, stunted and dying non-transgenic Sunrise grow in adjoining plots.
Researchers from Hawaii and an international consortium have produced a draft genome assembly for 'SunUp', the first commercial virus-resistant transgenic fruit tree. Comparison of this plant genome to those of Arabidopsis and others sheds light on evolution of characteristics such as biosynthesis, starch deposition, control of photosynthesis and pathways for creating volatile compounds.
Papaya, a fruit crop cultivated in tropical and subtropical regions, is known for its nutritional benefits and medicinal applications. Here we report a 3× draft genome sequence of ‘SunUp’ papaya, the first commercial virus-resistant transgenic fruit tree [1] to be sequenced. The papaya genome is three times the size of the Arabidopsis genome, but contains fewer genes, including significantly fewer disease-resistance gene analogues. Comparison of the five sequenced genomes suggests a minimal angiosperm gene set of 13,311. A lack of recent genome duplication, atypical of other angiosperm genomes sequenced so far [2,3,4,5], may account for the smaller papaya gene number in most functional groups. Nonetheless, striking amplifications in gene number within particular functional groups suggest roles in the evolution of tree-like habit, deposition and remobilization of starch reserves, attraction of seed dispersal agents, and adaptation to tropical daylengths. Transgenesis at three locations is closely associated with chloroplast insertions into the nuclear genome, and with topoisomerase I recognition sites. Papaya offers numerous advantages as a system for fruit-tree functional genomics, and this draft genome sequence provides the foundation for revealing the basis of Carica’s distinguishing morpho-physiological, medicinal and nutritional properties.
Assembling a plant genome is challenging due to the abundance of repetitive sequences, yet no standard is available to evaluate the assembly of repeat space. LTR retrotransposons (LTR-RTs) are the predominant interspersed repeat that is poorly assembled in draft genomes. Here, we propose a reference-free genome metric called LTR Assembly Index (LAI) that evaluates assembly continuity using LTR-RTs. After correcting for LTR-RT amplification dynamics, we show that LAI is independent of genome size, genomic LTR-RT content, and gene space evaluation metrics (i.e., BUSCO and CEGMA). By comparing genomic sequences produced by various sequencing techniques, we reveal the significant gain of assembly continuity by using long-read-based techniques over short-read-based methods. Moreover, LAI can facilitate iterative assembly improvement with assembler selection and identify low-quality genomic regions. To apply LAI, intact LTR-RTs and total LTR-RTs should contribute at least 0.1% and 5% to the genome size, respectively. The LAI program is freely available on GitHub: https://github.com/oushujun/LTR_retriever.
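The applicability thresholds stated above (intact LTR-RTs at least 0.1% and total LTR-RTs at least 5% of genome size) can be checked with a few lines of arithmetic. The sketch below is a hypothetical helper, not part of LTR_retriever. The function names and inputs are assumptions. `raw_ratio` is only the uncorrected share of intact LTR-RT sequence within total LTR-RT sequence, given here as an assumption about what the index is built on; it does not reproduce the amplification-dynamics correction that the abstract describes, so it should not be read as the published LAI value.

```python
def lai_applicable(
    genome_bp: int,
    intact_ltr_bp: int,
    total_ltr_bp: int,
    min_intact_pct: float = 0.1,  # threshold from the abstract
    min_total_pct: float = 5.0,   # threshold from the abstract
) -> bool:
    """Check whether an assembly meets the stated minimum LTR-RT content."""
    if genome_bp <= 0:
        raise ValueError("genome_bp must be positive")
    intact_pct = 100 * intact_ltr_bp / genome_bp
    total_pct = 100 * total_ltr_bp / genome_bp
    return intact_pct >= min_intact_pct and total_pct >= min_total_pct


def raw_ratio(intact_ltr_bp: int, total_ltr_bp: int) -> float:
    """Uncorrected percentage of total LTR-RT sequence that is intact.

    Assumption for illustration: this omits the correction for LTR-RT
    amplification dynamics, so it is not the final LAI score.
    """
    if total_ltr_bp <= 0:
        raise ValueError("total_ltr_bp must be positive")
    return 100 * intact_ltr_bp / total_ltr_bp


if __name__ == "__main__":
    # Made-up numbers: a 1 Gb assembly with 2 Mb intact and 100 Mb total LTR-RTs.
    print(lai_applicable(1_000_000_000, 2_000_000, 100_000_000))
    print(raw_ratio(2_000_000, 100_000_000))
```

For real assemblies, the LAI program distributed with LTR_retriever at the URL above is the tool to use; this helper only illustrates the threshold check.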