GeneCompass: deciphering universal gene regulatory mechanisms with a knowledge-informed cross-species foundation model

Xiaodong Yang (Chinese Academy of Sciences), Guole Liu (Chinese Academy of Sciences), Guihai Feng (Chinese Academy of Sciences), Dechao Bu (Chinese Academy of Sciences), Pengfei Wang (Chinese Academy of Sciences), Jie Jiang (Chinese Academy of Sciences), Shubai Chen (Chinese Academy of Sciences), Qinmeng Yang (Chinese Academy of Sciences), Hefan Miao (Chinese Academy of Sciences), Yiyang Zhang (Chinese Academy of Sciences), Zhenpeng Man (Chinese Academy of Sciences), Zhongming Liang (Chinese Academy of Sciences), Zichen Wang (Chinese Academy of Sciences), Yaning Li (Chinese Academy of Sciences), Zheng Li (Chinese Academy of Sciences), Yana Liu (Chinese Academy of Sciences), Yao Tian (Chinese Academy of Sciences), Wenhao Liu (Chinese Academy of Sciences), Cong Li (Chinese Academy of Sciences), Ao Li (Chinese Academy of Sciences), Jingxi Dong (Chinese Academy of Sciences), Zhilong Hu (Chinese Academy of Sciences), Fang Chen (Chinese Academy of Sciences), Lina Cui (Chinese Academy of Sciences), Zixu Deng (Chinese Academy of Sciences), Haiping Jiang (Chinese Academy of Sciences), Wentao Cui (Chinese Academy of Sciences), Jiahao Zhang (Chinese Academy of Sciences), Zhaohui Yang (Chinese Academy of Sciences), Handong Li (Chinese Academy of Sciences), Xingjian He (Chinese Academy of Sciences), Liqun Zhong (Chinese Academy of Sciences), Jiaheng Zhou (Chinese Academy of Sciences), Zijian Wang (Chinese Academy of Sciences), Qingqing Long (Chinese Academy of Sciences), Ping Xu (Chinese Academy of Sciences), Xin Li (Institute for Stem Cell Biology and Regenerative Medicine), Hongmei Wang (Chinese Academy of Sciences), Baoyang Hu (Chinese Academy of Sciences), Wei Li (Chinese Academy of Sciences), Fei Gao (Chinese Academy of Sciences), Jingtao Guo (Chinese Academy of Sciences), Leqian Yu (Chinese Academy of Sciences), Qi Gu (Chinese Academy of Sciences), Weiwei Zhai (Chinese Academy of Sciences), Zhengting Zou (Chinese Academy of Sciences), Jingqi Yu (Chinese Academy of Sciences), Wenhui Wu (Chinese Academy of Sciences), Xinxin Lin (Chinese Academy of Sciences), Yu Zou (Chinese Academy of Sciences), Yongshun Ren (Chinese Academy of Sciences), Fan Li (Chinese Academy of Sciences), Yixiao Zhao (Chinese Academy of Sciences), Yike Xin (Chinese Academy of Sciences), Longfei Han (Chinese Academy of Sciences), Shuyang Jiang (Chinese Academy of Sciences), Kai Ma (Chinese Academy of Sciences), Qicheng Chen (Chinese Academy of Sciences), Haoyuan Wang (Chinese Academy of Sciences), Huanhuan Wu (Chinese Academy of Sciences), Chaofan He (Chinese Academy of Sciences), Yilong Hu (Chinese Academy of Sciences), Shuyu Guo (Chinese Academy of Sciences), Yiyun Li (Chinese Academy of Sciences), Yuanchun Zhou (Chinese Academy of Sciences), Yangang Wang (Chinese Academy of Sciences), Xuezhi Wang (Chinese Academy of Sciences), Fei Li (Chinese Academy of Sciences), Zhen Meng (Chinese Academy of Sciences), Zaitian Wang (Chinese Academy of Sciences), Huimin He (Chinese Academy of Sciences), Shan Zong (Chinese Academy of Sciences), Jiajia Wang (Chinese Academy of Sciences), Yan Chen (Chinese Academy of Sciences), Chunyang Zhang (Chinese Academy of Sciences), Chengrui Wang (Chinese Academy of Sciences), Ran Zhang (Chinese Academy of Sciences), Meng Xiao (Chinese Academy of Sciences), Yining Wang (Chinese Academy of Sciences), Yiqiang Chen (Institute of Computing Technology), Yi Zhao (Chinese Academy of Sciences), Xin Qin (Chinese Academy of Sciences), Jiaxin Qin (Chinese Academy of Sciences), Chenhao Li (Chinese Academy of Sciences), Zhufeng Xu (Chinese Academy of Sciences), Zeyuan Zhang (Chinese Academy of Sciences), Xiaoning Qi (Chinese Academy of Sciences), Wuliang Huang (Chinese Academy of Sciences), Yang Ge (Chinese Academy of Sciences), Jing Liu (Chinese Academy of Sciences), Yaoru Luo (Chinese Academy of Sciences), Qinxuan Luo (Chinese Academy of Sciences), Ziwen Liu (Chinese Academy of Sciences), Teng Wang (Chinese Academy of Sciences), Yiming Huang (Chinese Academy of Sciences), Yong Wang (Chinese Academy of Sciences), Shihua Zhang (Chinese Academy of Sciences), Shirui Li (Chinese Academy of Sciences), Kangning Dong (Chinese Academy of Sciences), Qunlun Shen (Chinese Academy of Sciences), Yiqiang Chen (Chinese Academy of Sciences), Xin Li (Chinese Academy of Sciences)
Cell Research
October 7, 2024
Cited by 101
Open Access

Abstract

Deciphering universal gene regulatory mechanisms in diverse organisms holds great potential for advancing our knowledge of fundamental life processes and facilitating clinical applications. However, the traditional research paradigm primarily focuses on individual model organisms and does not integrate various cell types across species. Recent breakthroughs in single-cell sequencing and deep learning techniques present an unprecedented opportunity to address this challenge. In this study, we built an extensive dataset of over 120 million human and mouse single-cell transcriptomes. After data preprocessing, we obtained 101,768,420 single-cell transcriptomes and developed a knowledge-informed cross-species foundation model, named GeneCompass. During pre-training, GeneCompass effectively integrated four types of prior biological knowledge to enhance our understanding of gene regulatory mechanisms in a self-supervised manner. By fine-tuning for multiple downstream tasks, GeneCompass outperformed state-of-the-art models in diverse applications for a single species and unlocked new realms of cross-species biological investigations. We also employed GeneCompass to search for key factors associated with cell fate transition and showed that the predicted candidate genes could successfully induce the differentiation of human embryonic stem cells into the gonadal fate. Overall, GeneCompass demonstrates the advantages of using artificial intelligence technology to decipher universal gene regulatory mechanisms and shows tremendous potential for accelerating the discovery of critical cell fate regulators and candidate drug targets.

