Integrating ESM-2 and Graph Neural Networks with AlphaFold-2 Structures for Enhanced Protein Function Prediction

Thi-Tuyen Nguyen (Thai Nguyen University), Zhuocheng Jiang (Yong In University), Van-Nui Nguyen (Thai Nguyen University), Nguyen Quoc Khanh Le (Taipei Medical University Hospital), Matthew Chin Heng Chua (Yong In University)
ACS Omega
August 16, 2025
Cited by 9 · Open Access

Abstract

Protein function prediction is essential for elucidating biological processes and accelerating drug discovery. However, the vast number of unannotated protein sequences and the limited availability of experimentally validated functional data remain major challenges. Although deep learning models based on protein sequences or protein-protein interaction networks have shown promise, their performance is still restricted, particularly for proteins without interaction data. Furthermore, many existing approaches treat sequence and structural information separately, potentially resulting in suboptimal feature representations. To address these limitations, we propose an improved graph-based framework that integrates two key innovations: (i) ESM-2, a state-of-the-art protein language model, to generate semantically rich sequence embeddings; and (ii) a hybrid pooling mechanism within graph convolutional blocks to better capture both global and local structural features from AlphaFold2-predicted structures. Experiments on the human proteome demonstrate that our model consistently outperforms existing methods in predicting molecular function, cellular component, and biological process annotations. These findings highlight the advantages of combining advanced sequence representations with enhanced structural learning for accurate and generalizable protein function prediction.
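The hybrid pooling idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the pooling is defined, so the code below assumes one common reading: a graph convolution step over a residue contact graph, followed by a readout that concatenates mean pooling (a global summary) with max pooling (the strongest local signals). The function names `graph_conv_mean` and `hybrid_pool`, the parameter-free neighbor averaging, and the toy feature values are all hypothetical and are not taken from the paper; in a real pipeline the node features would be per-residue ESM-2 embeddings and the edges would come from contacts in an AlphaFold2-predicted structure.

```python
from typing import List, Tuple

Vector = List[float]


def graph_conv_mean(features: List[Vector], edges: List[Tuple[int, int]]) -> List[Vector]:
    """One parameter-free message-passing step (an assumption, not the paper's layer).

    Each node's new feature vector is the average of its own vector and
    those of its neighbors in the undirected contact graph.
    """
    n = len(features)
    dim = len(features[0])
    # Start every neighbor set with the node itself (self-loop).
    neighbors = [{i} for i in range(n)]
    for i, j in edges:
        neighbors[i].add(j)
        neighbors[j].add(i)
    updated = []
    for i in range(n):
        nbrs = neighbors[i]
        updated.append(
            [sum(features[j][d] for j in nbrs) / len(nbrs) for d in range(dim)]
        )
    return updated


def hybrid_pool(features: List[Vector]) -> Vector:
    """Concatenate mean pooling and max pooling over all nodes.

    This is an assumed form of 'hybrid pooling'; the output has twice
    the per-node feature dimension.
    """
    n = len(features)
    dim = len(features[0])
    mean_part = [sum(f[d] for f in features) / n for d in range(dim)]
    max_part = [max(f[d] for f in features) for d in range(dim)]
    return mean_part + max_part


# Toy example: three residues with 2-dimensional features, chained 0-1-2.
toy_features = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0]]
toy_edges = [(0, 1), (1, 2)]

hidden = graph_conv_mean(toy_features, toy_edges)
protein_vector = hybrid_pool(hidden)
print(protein_vector)
```

In this sketch the pooled vector would feed a classifier over Gene Ontology terms. Concatenating the two readouts is one simple way to keep both an average view of the whole structure and the most strongly activated residues; other combinations, such as a learned weighting, are equally plausible readings of the abstract.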

