Haidong Yan

Anhui Agricultural University

ORCID: 0000-0002-9903-2672

Publishes on Bioenergy crop production and management, Genomics and Phylogenetic Studies, Plant Stress Responses and Tolerance. 108 papers and 2.5k citations.

Top publications by citations

DeepTE: a computational method for de novo classification of transposons with convolutional neural network
Cited by 206 | Open Access

MOTIVATION: Transposable elements (TEs) classification is an essential step to decode their roles in genome evolution. With a large number of genomes from non-model species becoming available, accurate and efficient TE classification has emerged as a new challenge in genomic sequence analysis. RESULTS: We developed a novel tool, DeepTE, which classifies unknown TEs using convolutional neural networks (CNNs). DeepTE transferred sequences into input vectors based on k-mer counts. A tree structured classification process was used where eight models were trained to classify TEs into super families and orders. DeepTE also detected domains inside TEs to correct false classification. An additional model was trained to distinguish between non-TEs and TEs in plants. Given unclassified TEs of different species, DeepTE can classify TEs into seven orders, which include 15, 24 and 16 super families in plants, metazoans and fungi, respectively. In several benchmarking tests, DeepTE outperformed other existing tools for TE classification. In conclusion, DeepTE successfully leverages CNN for TE classification, and can be used to precisely classify TEs in newly sequenced eukaryotic genomes. AVAILABILITY AND IMPLEMENTATION: DeepTE is accessible at https://github.com/LiLabAtVT/DeepTE. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Pangenomic analysis identifies structural variation associated with heat tolerance in pearl millet
Haidong Yan, Min Sun, Zhongren Zhang et al.|Nature Genetics|2023
Cited by 178 | Open Access

Pearl millet is an important cereal crop worldwide and shows superior heat tolerance. Here, we developed a graph-based pan-genome by assembling ten chromosomal genomes with one existing assembly adapted to different climates worldwide and captured 424,085 genomic structural variations (SVs). Comparative genomics and transcriptomics analyses revealed the expansion of the RWP-RK transcription factor family and the involvement of endoplasmic reticulum (ER)-related genes in heat tolerance. The overexpression of one RWP-RK gene led to enhanced plant heat tolerance and transactivated ER-related genes quickly, supporting the important roles of RWP-RK transcription factors and ER system in heat tolerance. Furthermore, we found that some SVs affected the gene expression associated with heat tolerance and SVs surrounding ER-related genes shaped adaptation to heat tolerance during domestication in the population. Our study provides a comprehensive genomic resource revealing insights into heat tolerance and laying a foundation for generating more robust crops under the changing climate.

Transcriptome analysis of heat stress and drought stress in pearl millet based on Pacbio full-length transcriptome sequencing
Min Sun, Dejun Huang, Ailing Zhang et al.|BMC Plant Biology|2020
Cited by 144 | Open Access

BACKGROUND: Heat and drought are serious threats for crop growth and development. As the sixth largest cereal crop in the world, pearl millet can not only be used for food and forage but also as a source of bioenergy. Pearl millet is highly tolerant to heat and drought. Given this, it is considered an ideal crop to study plant stress tolerance and can be used to identify heat-resistant genes. RESULTS: In this study, we used Pacbio sequencing data as a reference sequence to analyze the Illumina data of pearl millet that had been subjected to heat and drought stress for 48 h. By summarizing previous studies, we found 26,299 new genes and 63,090 new transcripts, and the number of gene annotations increased by 20.18%. We identified 2792 transcription factors and 1223 transcriptional regulators. There were 318 TFs and 149 TRs differentially expressed under heat stress, and 315 TFs and 128 TRs were differentially expressed under drought stress. We used RNA sequencing to identify 6920 genes and 6484 genes differentially expressed under heat stress and drought stress, respectively. CONCLUSIONS: Through Pacbio sequencing, we have identified more new genes and new transcripts. On the other hand, comparing the differentially expressed genes under heat tolerance with the DEGs under drought stress, we found that even in the same pathway, pearl millet responds with a different protein.

Identification of Candidate Reference Genes in Perennial Ryegrass for Quantitative RT-PCR under Various Abiotic Stress Conditions
Linkai Huang, Haidong Yan, Xiaomei Jiang et al.|PLoS ONE|2014
Cited by 104 | Open Access

BACKGROUND: Quantitative real-time reverse-transcriptase PCR (qRT-PCR) is an important technique for analyzing differences in gene expression due to its sensitivity, accuracy and specificity. However, the stability of the expression of reference genes is necessary to ensure accurate qRT-PCR assessment of expression in genes of interest. Perennial ryegrass (Lolium perenne L.) is an important forage and turf grass species in temperate regions, but the expression stability of its reference genes under various stresses has not been well-studied. METHODOLOGY/PRINCIPAL FINDINGS: In this study, 11 candidate reference genes were evaluated for use as controls in qRT-PCR to quantify gene expression in perennial ryegrass under drought, high salinity, heat, waterlogging, and ABA (abscisic acid) treatments. Four approaches (Delta CT, geNorm, BestKeeper and NormFinder) were used to determine the stability of expression in these reference genes. The results are consistent with the idea that the best reference genes depend on the stress treatment under investigation. Eukaryotic initiation factor 4 alpha (eIF4A), Transcription elongation factor 1 (TEF1) and Tat binding protein-1 (TBP-1) were the three most stably expressed genes under drought stress and were also the three best genes for studying salt stress. eIF4A, TBP-1, and Ubiquitin-conjugating enzyme (E2) were the most suitable reference genes to study heat stress, while eIF4A, TEF1, and E2 were the three best reference genes for studying the effects of ABA. Finally, Ubiquitin (UBQ), TEF1, and eIF4A were the three best reference genes for waterlogging treatments. CONCLUSIONS/SIGNIFICANCE: These results will be helpful in choosing the best reference genes for use in studies related to various abiotic stresses in perennial ryegrass. The stability of expression in these reference genes will enable better normalization and quantification of transcript levels in such gene expression studies.

Transposon activation is a major driver in the genome evolution of cultivated olive trees (Olea europaea L.)
Cited by 99 | Open Access

The primary domestication of olive (Olea europaea L.) in the Levant dates back to the Neolithic period, around 6,000-5,500 BC, as some archeological remains attest. Cultivated olive trees are reproduced clonally, with sexual crosses being the sporadic events that drive the development of new varieties. In order to determine the genomic changes which have occurred in a modern olive cultivar, the genome of the 'Picual' cultivar, one of the most popular olive varieties, was sequenced. An additional 40 cultivated and 10 wild accessions were re-sequenced to elucidate the evolution of the olive genome during the domestication process. It was found that the genome of the 'Picual' cultivar contains 79,667 gene models, of which 78,079 were protein-coding genes and 1,588 were tRNA. Population analyses support two independent events in olive domestication, including an early possible genetic bottleneck. Despite genetic bottlenecks, cultivated accessions showed a high genetic diversity driven by the activation of transposable elements (TEs). High TE gene expression was observed in presently cultivated olives, which suggests a current activity of TEs in domesticated olives. Several TE families were expanded in the last 5,000 or 6,000 years and produced insertions near genes that may have been involved in selected traits during domestication, such as reproduction, photosynthesis, seed development, and oil production. Therefore, great genetic variability has been found in cultivated olive as a result of a significant activation of TEs during the domestication process.