Yi Han

Nanchang University

ORCID: 0000-0002-2835-9280

Publishes on Cancer Genomics and Diagnostics, Epigenetics and DNA Methylation, and Influenza Virus Research Studies. 63 papers and 10.5k citations.

Top publications by citations

MITE-Hunter: a program for discovering miniature inverted-repeat transposable elements from genomic sequences
Yi Han, S. R. Wessler|Nucleic Acids Research|2010
Cited by 696|Open Access

Miniature inverted-repeat transposable elements (MITEs) are a special type of Class 2 non-autonomous transposable element (TE) that are abundant in the non-coding regions of the genes of many plant and animal species. The accurate identification of MITEs has been a challenge for existing programs because they lack coding sequences and, as such, evolve very rapidly. Because of their importance to gene and genome evolution, we developed MITE-Hunter, a program pipeline that can identify MITEs as well as other small Class 2 non-autonomous TEs from genomic DNA data sets. The output of MITE-Hunter is composed of consensus TE sequences grouped into families that can be used as a library file for homology-based TE detection programs such as RepeatMasker. MITE-Hunter was evaluated by searching the rice genomic database and comparing the output with known rice TEs. It discovered most of the previously reported rice MITEs (97.6%), and found sixteen new elements. MITE-Hunter was also compared with two other MITE discovery programs, FINDMITE and MUST. Unlike MITE-Hunter, neither of these programs can search large genomic data sets including whole genome sequences. More importantly, MITE-Hunter is significantly more accurate than either FINDMITE or MUST as the vast majority of their outputs are false-positives.
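
The defining structural feature of a MITE, terminal inverted repeats (TIRs), can be illustrated with a short sketch. This is not MITE-Hunter's algorithm, which the abstract does not describe in detail; the function names `reverse_complement` and `tir_length` and the `max_mismatches` parameter are hypothetical and chosen only for illustration. The sketch compares the 5' end of a candidate sequence with the reverse complement of its 3' end.

```python
# Illustrative sketch only: NOT MITE-Hunter's actual method.
# All names below are hypothetical.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def tir_length(seq: str, max_mismatches: int = 0) -> int:
    """Length of the terminal inverted repeat at the ends of seq.

    Walks inward from both ends, comparing the 5' end with the reverse
    complement of the 3' end, and stops once more than max_mismatches
    mismatches have been seen. Returns the position of the last match.
    """
    seq = seq.upper()
    rc = reverse_complement(seq)
    mismatches = 0
    best = 0
    for i in range(len(seq) // 2):
        if seq[i] == rc[i]:
            best = i + 1
        else:
            mismatches += 1
            if mismatches > max_mismatches:
                break
    return best


# A toy candidate: 10-bp TIRs flanking a 10-bp internal region.
candidate = "GGCCAGTCAC" + "CATATATTTC" + "GTGACTGGCC"
print(tir_length(candidate))
```

A real pipeline would also need to handle target site duplications, copy-number filtering, and family clustering, none of which this sketch attempts.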

DriverML: a machine learning algorithm for identifying driver genes in cancer sequencing studies
Yi Han, Juze Yang, Xinyi Qian et al.|Nucleic Acids Research|2019
Cited by 152|Open Access

Although rapid progress has been made in computational approaches for prioritizing cancer driver genes, research is far from achieving the ultimate goal of discovering a complete catalog of genes truly associated with cancer. Driver gene lists predicted from these computational tools lack consistency and are prone to false positives. Here, we developed an approach (DriverML) integrating Rao's score test and supervised machine learning to identify cancer driver genes. The weight parameters in the score statistics quantified the functional impacts of mutations on the protein. To obtain optimized weight parameters, the score statistics of prior driver genes were maximized on pan-cancer training data. We conducted rigorous and unbiased benchmark analysis and comparisons of DriverML with 20 other existing tools in 31 independent datasets from The Cancer Genome Atlas (TCGA). Our comprehensive evaluations demonstrated that DriverML was robust and powerful among various datasets and outperformed the other tools with a better balance of precision and sensitivity. In vitro cell-based assays further proved the validity of the DriverML prediction of novel driver genes. In summary, DriverML uses an innovative, machine learning-based approach to prioritize cancer driver genes and provides dramatic improvements over currently existing methods. Its source code is available at https://github.com/HelloYiHan/DriverML.

Insights into the Structure and Regulation of Glucokinase from a Novel Mutation (V62M), Which Causes Maturity-onset Diabetes of the Young
Anna L. Gloyn, Stella Odili, Dorothy Zelent et al.|Journal of Biological Chemistry|2005
Cited by 108|Open Access

Glucokinase (GCK) serves as the pancreatic glucose sensor. Heterozygous inactivating GCK mutations cause hyperglycemia, whereas activating mutations cause hypoglycemia. We studied the GCK V62M mutation identified in two families and co-segregating with hyperglycemia to understand how this mutation resulted in reduced function. Structural modeling locates the mutation close to five naturally occurring activating mutations in the allosteric activator site of the enzyme. Recombinant glutathionyl S-transferase-V62M GCK is paradoxically activated rather than inactivated due to a decreased S0.5 for glucose compared with wild type (4.88 versus 7.55 mM). The recently described pharmacological activator (RO0281675) interacts with GCK at this site. V62M GCK does not respond to RO0281675, nor does it respond to the hepatic glucokinase regulatory protein (GKRP). The enzyme is also thermally unstable, but this lability is apparently less pronounced than in the proven instability mutant E300K. Functional and structural analysis of seven amino acid substitutions at residue Val62 has identified a non-linear relationship between activation by the pharmacological activator and the van der Waals interaction energies. Smaller energies allow a hydrophobic interaction between the activator and glucokinase, whereas larger energies prohibit the ligand from fitting into the binding pocket. We conclude that V62M may cause hyperglycemia by a complex defect of GCK regulation involving instability in combination with loss of control by a putative endogenous activator and/or GKRP. This study illustrates that mutations that cause hyperglycemia are not necessarily kinetically inactivating but may exert their effects by other complex mechanisms. Elucidating such mechanisms leads to a deeper understanding of the GCK glucose sensor and the biochemistry of beta-cells and hepatocytes.

T-Cell Receptor Repertoire Sequencing in the Era of Cancer Immunotherapy
Meredith Frank, Kaylene Lu, Can Erdogan et al.|Clinical Cancer Research|2022
Cited by 70|Open Access

T cells are integral components of the adaptive immune system, and their responses are mediated by unique T-cell receptors (TCR) that recognize specific antigens from a variety of biological contexts. As a result, analyzing the T-cell repertoire offers a better understanding of immune responses and of diseases like cancer. Next-generation sequencing technologies have greatly enabled the high-throughput analysis of the TCR repertoire. On the basis of our extensive experience in the field from the past decade, we provide an overview of TCR sequencing, from the initial library preparation steps to sequencing and analysis methods and finally to functional validation techniques. With regard to data analysis, we detail important TCR repertoire metrics and present several computational tools for predicting antigen specificity. Finally, we highlight important applications of TCR sequencing and repertoire analysis to understanding tumor biology and developing cancer immunotherapies.
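
The abstract does not name the repertoire metrics it covers. As an assumption, the sketch below shows two that are commonly used for clonotype count data: Shannon entropy and clonality (one minus normalized entropy). The function names and the convention that a single-clonotype repertoire has clonality 1.0 are illustrative choices, not taken from the paper.

```python
import math
from typing import Sequence


def shannon_entropy(counts: Sequence[int]) -> float:
    """Shannon entropy (natural log) of clonotype frequencies."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    entropy = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            entropy -= p * math.log(p)
    return entropy


def clonality(counts: Sequence[int]) -> float:
    """Clonality = 1 - H / ln(N), where N is the number of clonotypes
    with a non-zero count. Values near 0 indicate an even repertoire;
    values near 1 indicate dominance by a few clones.
    """
    n = sum(1 for c in counts if c > 0)
    if n <= 1:
        # Assumed convention: a single clonotype is fully clonal.
        return 1.0
    return 1.0 - shannon_entropy(counts) / math.log(n)


# Hypothetical read counts per clonotype.
even_repertoire = [25, 25, 25, 25]
skewed_repertoire = [970, 10, 10, 10]
print(clonality(even_repertoire), clonality(skewed_repertoire))
```

The skewed repertoire yields a higher clonality than the even one, which is the pattern such a metric is meant to capture when a few T-cell clones have expanded.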

A DNA-binding activity, TRAC, specific for the TRA element of the transferrin receptor gene copurifies with the Ku autoantigen
Michelle Roberts, Yi Han, Allen A. Fienberg et al.|Proceedings of the National Academy of Sciences|1994
Cited by 46|Open Access

We have previously described purification and characterization of a nuclear protein, TREF, which interacts specifically with the transcriptional control element, TRA, of the human transferrin receptor (TR) gene. In this report we show that TREF can be separated into two functionally distinct DNA-binding activities. The first DNA-binding activity (TRAC) is highly specific for the 8-bp element TRA and the related Escherichia coli cAMP receptor binding site. This motif is homologous to the phorbol 12-tetradecanoate 13-acetate- and cAMP-responsive elements of eukaryotic genes and the regulatory proximal sequence elements of the U1 small nuclear RNA gene and is also present in the promoter of the Drosophila melanogaster yolk protein factor 1 gene. In striking contrast, the second activity exhibits high affinity for the ends of double-stranded DNA in a sequence-unspecific manner and is attributable to the heterodimeric Ku autoantigen. Notably, transcription of Ku is induced during mid-late G0/G1 with kinetics similar to the TR gene. Ku is a highly abundant nuclear protein possessing nonspecific affinity for the ends of DNA, whose biological role remains to be elucidated. A transcriptional role for this protein has been proposed, however, on the basis of studies attributing DNA sequence-specific binding activity, notably for TRA-like sequences described above, directly to the Ku heterodimer. The observation that Ku-mediated nonspecific DNA-binding activity copurifies with the TRA-specific activity, TRAC, clearly has implications for these and related studies. The unusual properties of TRAC activity and its relationship, if any, with the enigmatic Ku protein, are discussed.