University of Science and Technology of China
ORCID: 0009-0001-2881-962X
Publishes on Bioinformatics and Genomic Networks, Gene Regulatory Network Analysis, Microbial Metabolic Engineering and Bioproduction. 22 papers and 322 citations.
Bioinformatics has undergone a paradigm shift driven by artificial intelligence (AI), particularly foundation models (FMs), which address longstanding challenges in the field such as limited annotated data and data noise. These AI techniques have demonstrated remarkable efficacy across various downstream validation tasks, effectively representing diverse biological entities and heralding a new era in computational biology. The primary goal of this survey is to investigate and summarize FMs in bioinformatics, tracing their evolutionary trajectory, current research landscape, and methodological frameworks. Our primary focus is on elucidating the application of FMs to specific biological problems, offering insights to guide the research community in choosing appropriate FMs for tasks like sequence analysis, structure prediction, and function annotation. Each section delves into the intricacies of the targeted challenges, contrasting the architectures and advancements of FMs with conventional methods and showcasing their utility across different biological domains. Further, this review scrutinizes the hurdles and constraints encountered by FMs in biology, including issues of data noise, model interpretability, and potential biases. This analysis provides a theoretical groundwork for understanding the circumstances under which certain FMs may exhibit suboptimal performance. Lastly, we outline prospective pathways and methodologies for the future development of FMs in biological research, facilitating ongoing innovation in the field. This comprehensive examination serves not only as an academic reference but also as a roadmap for forthcoming explorations and applications of FMs in biology.
Findings from a recent study of the largest documented cohort of individuals with Down syndrome (DS) in the United States described prevalence of common disease conditions and strongly suggested significant disparity in mental health conditions among these individuals as compared with age- and sex-matched individuals without DS. The retrospective, descriptive study reported herein is a follow-up to document prevalence of 58 mental health conditions across 28 years of data from 6078 individuals with DS and 30,326 age- and sex-matched controls. Patient data were abstracted from electronic medical records within a large integrated health system. In general, individuals with DS had higher prevalence of mood disorders (including depression); anxiety disorders (including obsessive-compulsive disorder); schizophrenia; psychosis (including hallucinations); pseudobulbar affect; personality disorder; dementia (including Alzheimer's disease); mental disorder due to physiologic causes; conduct disorder; tic disorder; and impulse control disorder. Conversely, the DS cohort experienced lower prevalence of bipolar I disorder; generalized anxiety, panic, phobic, and posttraumatic stress disorders; substance use disorders (including alcohol, opioid, cannabis, cocaine, and nicotine disorders); and attention-deficit/hyperactivity disorder. Prevalence of many mental health conditions in the setting of DS vastly differs from comparable individuals without DS. These findings delineate a heretofore unclear jumping-off point for ongoing research.
Kinetic modeling of metabolic pathways has important applications in metabolic engineering, but significant challenges still remain. The difficulties faced vary from finding best-fit parameters in a highly multidimensional search space to incomplete parameter identifiability. To meet some of these challenges, an ensemble modeling method is developed for characterizing a subset of kinetic parameters that give statistically equivalent goodness-of-fit to time series concentration data. The method is based on the incremental identification approach, where the parameter estimation is done in a step-wise manner. Numerical efficacy is achieved by reducing the dimensionality of parameter space and using efficient random parameter exploration algorithms. The shift toward using model ensembles, instead of the traditional "best-fit" models, is necessary to directly account for model uncertainty during the application of such models. The performance of the ensemble modeling approach has been demonstrated in the modeling of a generic branched pathway and the trehalose pathway in Saccharomyces cerevisiae using generalized mass action (GMA) kinetics.
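The ensemble idea in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a single metabolite degraded by one GMA reaction (dx/dt = -k * x**g), synthetic noisy data, plain uniform random sampling, and a placeholder acceptance rule (SSE within a fixed multiple of the best SSE) standing in for a proper statistical equivalence test. All function names, parameter ranges, and numeric values are hypothetical.

```python
import random

def gma_rate(x, k, g):
    """GMA rate law for one substrate: v = k * x**g."""
    return k * x ** g

def simulate(x0, k, g, dt, n_steps, noise_sd, rng):
    """Euler-integrate dx/dt = -k * x**g and add Gaussian noise (synthetic data)."""
    clean = [x0]
    for _ in range(n_steps):
        clean.append(clean[-1] - dt * gma_rate(clean[-1], k, g))
    return [x + rng.gauss(0.0, noise_sd) for x in clean]

def estimate_slopes(xs, dt):
    """Incremental step 1: approximate dx/dt by central differences."""
    return [(xs[i + 1] - xs[i - 1]) / (2 * dt) for i in range(1, len(xs) - 1)]

def sse(params, xs_mid, slopes):
    """Incremental step 2 objective: mismatch between rate law and slopes."""
    k, g = params
    return sum((-gma_rate(x, k, g) - s) ** 2 for x, s in zip(xs_mid, slopes))

def build_ensemble(xs, dt, n_samples=5000, tolerance=2.0, seed=0):
    """Sample (k, g) at random and keep every set whose SSE is within
    `tolerance` times the best SSE (an assumed, simplified cutoff)."""
    rng = random.Random(seed)
    slopes = estimate_slopes(xs, dt)
    xs_mid = xs[1:-1]  # concentrations aligned with the central differences
    scored = []
    for _ in range(n_samples):
        params = (rng.uniform(0.01, 2.0), rng.uniform(0.1, 2.0))
        scored.append((sse(params, xs_mid, slopes), params))
    best = min(score for score, _ in scored)
    return sorted(item for item in scored if item[0] <= tolerance * best)

# Illustrative run with made-up "true" parameters k = 0.5, g = 0.8.
data = simulate(5.0, 0.5, 0.8, dt=0.1, n_steps=40, noise_sd=0.02,
                rng=random.Random(1))
ensemble = build_ensemble(data, dt=0.1)
print(len(ensemble), "parameter sets accepted; best:", ensemble[0])
```

Fitting the rate law to estimated slopes rather than to integrated trajectories avoids repeated ODE solves, which is the kind of efficiency the step-wise approach relies on. Downstream predictions would then be made with every accepted parameter set rather than a single best fit.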
Asthma is a heterogeneous, complex syndrome, and identifying asthma endotypes has been challenging. We hypothesize that distinct endotypes of asthma arise in disparate genetic variation and life-time environmental exposure backgrounds, and that disease comorbidity patterns serve as a surrogate for such genetic and exposure variations. Here, we computationally discover 22 distinct comorbid disease patterns among individuals with asthma (asthma comorbidity subgroups) using diagnosis records for >151 M US residents, and re-identify 11 of the 22 subgroups in the much smaller UK Biobank. GWASs to discern asthma risk loci for individuals within each subgroup and in all subgroups combined reveal 109 independent risk loci, of which 52 are replicated in multi-ancestry meta-analysis across different ethnicity subsamples in UK Biobank, US BioVU, and BioBank Japan. Fourteen loci confer asthma risk in multiple subgroups and in all subgroups combined. Importantly, another six loci confer asthma risk in only one subgroup. The strength of association between asthma and each of 44 health-related phenotypes also varies dramatically across subgroups. This work reveals subpopulations of asthma patients distinguished by comorbidity patterns, asthma risk loci, gene expression, and health-related phenotypes, and so reveals different asthma endotypes.