Progress and opportunities of foundation models in bioinformatics

Qing Li (Chinese University of Hong Kong), Zhihang Hu (Chinese University of Hong Kong), Yixuan Wang (Chinese University of Hong Kong), Lei Li (Chinese University of Hong Kong), Yimin Fan (Chinese University of Hong Kong), Irwin King (Chinese University of Hong Kong), Gengjie Jia (Agricultural Genomics Institute at Shenzhen), Sheng Wang (Shenzhen University), Le Song, Yu Li (Chinese University of Hong Kong)
Briefings in Bioinformatics
September 23, 2024
Cited by 49 | Open Access

Abstract

Bioinformatics has undergone a paradigm shift driven by artificial intelligence (AI), particularly foundation models (FMs), which address longstanding challenges in the field such as limited annotated data and data noise. These AI techniques have demonstrated remarkable efficacy across various downstream validation tasks, effectively representing diverse biological entities and heralding a new era in computational biology. This survey investigates and summarizes FMs in bioinformatics, tracing their evolution, the current research landscape, and their methodological frameworks. Our primary focus is the application of FMs to specific biological problems, offering insights to guide the research community in choosing appropriate FMs for tasks such as sequence analysis, structure prediction, and function annotation. Each section examines the targeted challenges in detail, contrasting the architectures and advancements of FMs with conventional methods and showcasing their utility across different biological domains. Further, this review scrutinizes the hurdles and constraints encountered by FMs in biology, including data noise, limited model interpretability, and potential biases. This analysis provides a theoretical groundwork for understanding the circumstances under which certain FMs may exhibit suboptimal performance. Lastly, we outline prospective pathways and methodologies for the future development of FMs in biological research, facilitating ongoing innovation in the field. This comprehensive examination serves not only as an academic reference but also as a roadmap for forthcoming explorations and applications of FMs in biology.

