Sage Journals: Discover world-class research

Abstract

Deep learning has become a powerful paradigm in computational biology, offering data-driven models that map protein sequences to structure and function with high precision. In protein structure prediction and functional annotation, traditional sequence-alignment and template-based methods are often limited when homology is low or structural diversity is high. To address these limitations, we propose a unified deep learning framework that integrates sequence modeling, structural representation, and functional inference in a single architecture. Transformer-based encoders extract contextual features from amino acid sequences, and graph neural networks capture spatial dependencies within the predicted structures. A multi-task learning scheme performs Cα backbone reconstruction and enzyme class prediction simultaneously. Joint training on sequence features and predicted inter-residue geometry improves generalization to rare or multifunctional proteins. Experiments on public benchmark datasets such as CASP14 and CAMEO-Hard show a 12.4% reduction in backbone RMSD relative to AlphaFold2 and a 9.7% improvement in PR-AUC for contact prediction, supporting the effectiveness of joint learning and structural integration. Compared with existing state-of-the-art baselines, including RoseTTAFold and trRosetta, our system is consistently more accurate on both structure and function tasks. This work provides a modular, end-to-end solution for large-scale protein analysis and could be extended to downstream tasks such as mutation effect prediction or protein–protein interaction modeling.
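The multi-task objective described in the abstract can be illustrated with a minimal sketch. The abstract does not give the loss function, so everything below is an assumption made for illustration: the function names (`ca_rmsd`, `cross_entropy`, `joint_loss`), the choice of mean squared Cα deviation for the structure term, softmax cross-entropy for the enzyme-class term, and the task weights `w_struct` and `w_func`. The sketch also assumes predicted and reference coordinates are already in the same frame; it performs no superposition.

```python
import math
from typing import Sequence

Coord = Sequence[float]  # one C-alpha position as (x, y, z)


def ca_rmsd(pred: Sequence[Coord], true: Sequence[Coord]) -> float:
    """RMSD over paired C-alpha coordinates, without superposition (assumed pre-aligned)."""
    if len(pred) != len(true) or len(pred) == 0:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        sum((p - t) ** 2 for p, t in zip(p_xyz, t_xyz))
        for p_xyz, t_xyz in zip(pred, true)
    )
    return math.sqrt(squared / len(pred))


def cross_entropy(logits: Sequence[float], label: int) -> float:
    """Softmax cross-entropy for one protein, using the log-sum-exp trick for stability."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[label]


def joint_loss(
    pred_ca: Sequence[Coord],
    true_ca: Sequence[Coord],
    ec_logits: Sequence[float],
    ec_label: int,
    w_struct: float = 1.0,  # hypothetical weight on the structure task
    w_func: float = 1.0,    # hypothetical weight on the function task
) -> float:
    """Weighted sum of a structure term (mean squared C-alpha deviation)
    and a function term (enzyme-class cross-entropy)."""
    structure_term = ca_rmsd(pred_ca, true_ca) ** 2
    function_term = cross_entropy(ec_logits, ec_label)
    return w_struct * structure_term + w_func * function_term


# Toy usage: two residues, two hypothetical enzyme classes.
pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
true = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
loss = joint_loss(pred, true, ec_logits=[0.0, 0.0], ec_label=0)
```

In a setup like this, both terms would be computed from a shared encoder, so gradients from the enzyme-class term also shape the representation used for backbone reconstruction. How the two tasks are actually balanced in the reported system is not stated in the abstract.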

Keywords

deep learning; protein structure prediction; functional annotation; transformer; graph neural networks; multi-task learning

Deep learning-based integrated framework for protein structure prediction and functional annotation
