Protein Function Prediction


Overview
  1. Application of deep learning in protein function prediction. Synthetic Biology Journal, 2023. [PDF]

  2. A comprehensive computational benchmark for evaluating deep learning-based protein function prediction approaches. Briefings in Bioinformatics, 2024. [PDF]

  3. Deep learning methods for protein function prediction. Proteomics, 2024. [PDF]

  4. Protein function prediction with gene ontology: from traditional to deep learning models. PeerJ, 2021. [PDF]

  5. The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens. Genome Biology, 2019. [PDF]

  6. Protein function annotation using protein domain family resources. Methods, 2016. [PDF]

Evaluation Metrics
  1. A large-scale assessment of sequence database search tools for homology-based protein function prediction. bioRxiv, 2023. [PDF]

  2. Evaluation: A large-scale evaluation of computational protein function prediction. Nature Methods, 2013. [PDF]

Tools
  1. InterPro in 2022. Nucleic Acids Research, 2022. [PDF] [Web Server]

Public Databases
  1. Protein sequence database: UniProt.

  2. Protein structure database: PDB, AlphaFold database.

  3. Protein function database: Gene Ontology, GOA.

  4. Protein-ligand structure database: BioLiP.

  5. Gene co-expression database: COXPRESdb, ATTED-II.

Template-Based Methods
  1. QAUST: Protein function prediction using structure similarity, protein interaction, and functional motifs. Genomics Proteomics Bioinformatics, 2021. Source: structure, protein-protein network, and functional motifs. [PDF]

  2. MLC: Metric learning on expression data for gene function prediction. Bioinformatics, 2020. Source: gene expression. [PDF] [Code]

  3. INGA 2.0: Improving protein function prediction for the dark proteome. Nucleic Acids Research, 2019. Source: sequence, protein-protein network, and domain. [PDF] [Web Server]

  4. MetaGO: Predicting gene ontology of non-homologous proteins through low-resolution protein structure prediction and protein-protein network mapping. Journal of Molecular Biology, 2018. Source: sequence, structure, and protein-protein network. [PDF] [Web Server]

  5. FunFams: Functional classification of CATH superfamilies: a domain-based approach for protein function annotation. Bioinformatics, 2017. Source: family and domain. [PDF] [Web Server]

  6. COFACTOR 2.0: Improved protein function prediction by combining structure, sequence and protein-protein interaction information. Nucleic Acids Research, 2017. Source: sequence, structure, and protein-protein network. [PDF] [Web Server]

  7. GoFDR: A sequence alignment based method for predicting protein functions. Methods, 2017. Source: sequence. [PDF]

  8. INGA: Protein function prediction combining interaction networks, domain assignments and sequence similarity. Nucleic Acids Research, 2015. Source: sequence, protein-protein network, and domain. [PDF]

  9. MS-kNN: Protein function prediction by integrating multiple data sources. BMC Bioinformatics, 2013. Source: sequence, protein-protein network, and gene expression. [PDF]

  10. dcGO: A domain-centric solution to functional genomics via dcGO predictor. BMC Bioinformatics, 2013. Source: domain. [PDF] [Web Server]

  11. COFACTOR 1.0: An accurate comparative algorithm for structure-based protein function annotation. Nucleic Acids Research, 2012. Source: structure. [PDF]

  12. FINDSITE: A combined evolution/structure-based approach to protein function prediction. Briefings in Bioinformatics, 2009. Source: structure. [PDF]

  13. MultiPfam2GO: Predicting protein function from domain content. Bioinformatics, 2009. Source: domain. [PDF]

  14. Exploiting indirect neighbours and topological weight to predict protein function from protein–protein interactions. Bioinformatics, 2006. Source: protein-protein network. [PDF]

  15. ProFunc: A server for predicting protein function from 3D structure. Nucleic Acids Research, 2005. Source: structure. [PDF]

  16. Blast2GO: A universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics, 2005. Source: sequence. [PDF]

  17. GOtcha: A new method for prediction of protein function assessed by the annotation of seven genomes. BMC Bioinformatics, 2004. Source: sequence. [PDF]

  18. Global protein function prediction from protein-protein interaction networks. Nature Biotechnology, 2003. Source: protein-protein network. [PDF]

  19. Predicting protein function from protein/protein interaction data: a probabilistic approach. Bioinformatics, 2003. Source: protein-protein network. [PDF]

Machine Learning-Based Methods
  1. NetGO: Improving large-scale protein function prediction with massive network information. Nucleic Acids Research, 2019. [PDF]

  2. GOLabeler: Improving sequence-based large-scale protein function prediction by learning to rank. Bioinformatics, 2018. [PDF]

  3. FFPred: An integrated feature-based function prediction server for vertebrate proteomes. Nucleic Acids Research, 2008. [PDF]

Deep Learning-Based Methods
  1. DeepSS2GO: Protein function prediction from secondary structure. Briefings in Bioinformatics, 2024. [PDF] [Code]

  2. MSF-PFP: A novel multisource feature fusion model for protein function prediction. Journal of Chemical Information and Modeling, 2024. [PDF] [Code]

  3. Chemical-SA-BiLSTM: Grain protein function prediction based on self-attention mechanism and bidirectional LSTM. Briefings in Bioinformatics, 2023. [PDF] [Code]

  4. ProteInfer: Deep neural networks for protein functional inference. eLife, 2023. [PDF] [Code]

  5. DeepPFP-CO: A deep learning framework for predicting protein functions with co-occurrence of GO terms. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2023. [PDF] [Web Server]

  6. TripletGO: Integrating transcript expression profiles with protein homology inferences for gene function prediction. Genomics Proteomics Bioinformatics, 2022. [PDF] [Web Server]

  7. DeepGOZero: Improving protein function prediction from sequence and zero-shot learning based on ontology axioms. Bioinformatics, 2022. [PDF] [Code]

  8. NetGO 2.0: Improving large-scale protein function prediction with massive sequence, text, domain, family and network information. Nucleic Acids Research, 2021. [PDF]

  9. DeepGOA: A deep learning framework for gene ontology annotations with sequence- and network-based information. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2021. [PDF] [Code]

  10. NetQuilt: Deep multispecies network-based protein function prediction using homology-informed network similarity. Bioinformatics, 2021. [PDF] [Code]

  11. MultiPredGO: Deep multi-modal protein function prediction by amalgamating protein structure, sequence, and interaction information. IEEE Journal of Biomedical and Health Informatics, 2021. [PDF] [Code]

  12. TALE: Transformer-based protein function Annotation with joint sequence-Label Embedding. Bioinformatics, 2021. [PDF] [Code]

  13. DeepGraphGO: Graph neural network for large-scale, multispecies protein function prediction. Bioinformatics, 2021. [PDF] [Code]

  14. Deep_CNN_LSTM_GO: Protein function prediction from amino-acid sequences. Computational Biology and Chemistry, 2021. [PDF]

  15. FFPred-GAN: Protein function prediction is improved by creating synthetic feature samples with generative adversarial networks. Nature Machine Intelligence, 2020. [PDF] [Code]

  16. DeepAdd: Protein function prediction from k-mer embedding and additional features. Computational Biology and Chemistry, 2020. [PDF]

  17. DeepGOPlus: Improved protein function prediction from sequence. Bioinformatics, 2020. [PDF] [Web Server]

  18. DeepGO: Predicting protein functions from sequence and interactions using a deep ontology-aware classifier. Bioinformatics, 2018. [PDF] [Code]

Pre-Trained Model-Based Methods
  1. AnnoPRO: A strategy for protein function annotation based on multi-scale protein representation and a hybrid deep learning of dual-path encoding. Genome Biology, 2024. [PDF] [Code]

  2. DualNetGO: A dual network model for protein function prediction via effective feature selection. Bioinformatics, 2024. [PDF] [Code]

  3. Domain-PFP: Protein function prediction using function-aware domain embedding representations. Communications Biology, 2023. [PDF] [Code]

  4. CFAGO: Cross-fusion of network and attributes based on attention mechanism for protein function prediction. Bioinformatics, 2023. [PDF] [Code]

  5. MELISSA: Semi-supervised embedding for protein function prediction across multiple networks. bioRxiv, 2023. [PDF] [Code]

  6. HiFun: Homology independent protein function prediction by a novel protein-language self-attention model. Briefings in Bioinformatics, 2023. [PDF] [Code]

  7. PredGO: Large-scale predicting protein functions through heterogeneous feature fusion. Briefings in Bioinformatics, 2023. [PDF] [Code]

  8. MGEGFP: A multi-view graph embedding method for gene function prediction based on adaptive estimation with GCN. Briefings in Bioinformatics, 2023. [PDF] [Code]

  9. MMSMAPlus: A multi-view multi-scale multi-attention embedding model for protein function prediction. Briefings in Bioinformatics, 2023. [PDF] [Code]

  10. PFmulDL: A novel strategy enabling multi-class and multi-label protein function annotation by integrating diverse deep learning methods. Computers in Biology and Medicine, 2022. [PDF] [Code]

  11. DeepFRI: Structure-based protein function prediction using graph convolutional networks. Nature Communications, 2021. [PDF] [Web Server] [Code]

  12. deepNF: Deep network fusion for protein function prediction. Bioinformatics, 2018. [PDF] [Code]

Large Language Model-Based Methods
  1. DeepGO-SE: Protein function prediction as approximate semantic entailment. Nature Machine Intelligence, 2024. Model: ESM2. [PDF] [Code]

  2. PhiGnet: Accurate prediction of protein function using statistics-informed graph networks. Nature Communications, 2024. Model: ESM-1b. [PDF] [Code]

  3. GPSFun: Geometry-aware protein sequence function predictions with language models. Nucleic Acids Research, 2024. Model: ESM2. [PDF] [Web Server]

  4. GO-LTR: Protein function prediction through multi-view multi-label latent tensor reconstruction. BMC Bioinformatics, 2024. Model: ProtTrans. [PDF] [Code]

  5. TransFew: Improving protein function prediction by learning and integrating representations of protein sequences and function labels. bioRxiv, 2024. Model: ESM2. [PDF] [Code]

  6. DeepGOMeta: Predicting functions for microbes. bioRxiv, 2024. Model: ESM2. [PDF] [Code]

  7. GNNGO3D: Protein function prediction based on 3D structure and functional hierarchy learning. IEEE Transactions on Knowledge and Data Engineering, 2023. Model: ESM-1b. [PDF]

  8. Struct2GO: Protein function prediction based on graph pooling algorithm and AlphaFold2 structure information. Bioinformatics, 2023. Model: SeqVec. [PDF] [Code]

  9. SPROF-GO: Fast and accurate protein function prediction from sequence through pretrained language model and homology-based label diffusion. Briefings in Bioinformatics, 2023. Model: ProtTrans. [PDF] [Code]

  10. HNetGO: Protein function prediction via heterogeneous network transformer. Briefings in Bioinformatics, 2023. Model: SeqVec. [PDF] [Code]

  11. NetGO 3.0: A protein language model improves large-scale functional annotations. Genomics Proteomics Bioinformatics, 2023. Model: ESM-1b. [PDF] [Web Server]

  12. HEAL: Hierarchical graph transformer with contrastive learning for protein function prediction. Bioinformatics, 2023. Model: ESM-1b. [PDF] [Code]

  13. TransFun: Combining protein sequences and structures with transformers and equivariant graph neural networks to predict protein function. Bioinformatics, 2023. Model: ESM-1b. [PDF] [Web Server]

  14. TEMPROT: Protein function annotation using transformers embeddings and homology search. BMC Bioinformatics, 2023. Model: ProtBERT-BFD. [PDF] [Web Server]

  15. PANDA2: Protein function prediction using graph neural networks. NAR Genomics and Bioinformatics, 2022. Model: ESM-1b. [PDF] [Web Server]

  16. ATGO: Integrating unsupervised language model with triplet neural networks for protein gene ontology prediction. PLoS Computational Biology, 2022. Model: ESM-1b. [PDF] [Web Server]

  17. GAT-GO: Accurate protein function prediction via graph attention networks with predicted structure information. Briefings in Bioinformatics, 2022. Model: ESM-1b. [PDF]

  18. GOPredSim: Embeddings from deep learning transfer GO annotations beyond homology. Scientific Reports, 2021. Model: SeqVec. [PDF] [Web Server] [Code]