GeoPep: A geometry-aware masked language model for protein-peptide binding site prediction
Chen, Dian; Chen, Yunkai; Lin, Tong; Chen, Sijie; Cheng, Xiaolin
arXiv.org Artificial Intelligence
Multimodal approaches that integrate protein structure and sequence have achieved strong results in protein-protein interface prediction. Extending these methods to protein-peptide interactions remains challenging, however, because peptides are conformationally flexible and structural data are scarce, both of which hinder direct training of structure-aware models. To address these limitations, we introduce GeoPep, a framework for peptide binding site prediction that leverages transfer learning from ESM3, a multimodal protein foundation model. GeoPep fine-tunes ESM3's pre-learned representations from protein-protein binding to compensate for the scarcity of protein-peptide binding data. The fine-tuned model is further integrated with a parameter-efficient neural network architecture capable of learning complex patterns from sparse data. In addition, the model is trained with distance-based loss functions that exploit 3D structural information to enhance binding site prediction. Comprehensive evaluations demonstrate that GeoPep significantly outperforms existing methods in protein-peptide binding site prediction by effectively capturing sparse and heterogeneous binding patterns.
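The abstract does not give the form of the distance-based losses, so the following is only an illustrative sketch of the general idea, not GeoPep's actual objective. It assumes a per-residue binary cross-entropy in which false positives on non-binding residues are down-weighted when the residue lies close to the peptide. The function names, the Gaussian weighting, and the `sigma` value (in ångströms) are all hypothetical choices made for this example.

```python
import math

def min_distances(protein_coords, peptide_coords):
    """Minimum Euclidean distance from each protein residue to any peptide atom."""
    return [min(math.dist(r, p) for p in peptide_coords) for r in protein_coords]

def distance_weighted_bce(probs, labels, dists, sigma=4.0, eps=1e-7):
    """Mean binary cross-entropy with a distance-dependent weight (assumed form).

    probs  : predicted binding probability per residue
    labels : 1 for binding residues, 0 otherwise
    dists  : per-residue minimum distance to the peptide
    sigma  : hypothetical length scale controlling how fast the weight grows
    """
    total = 0.0
    for p, y, d in zip(probs, labels, dists):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        if y == 1:
            # Binding residues keep the standard cross-entropy term.
            total += -math.log(p)
        else:
            # Non-binding residues: weight rises from 0 toward 1 with distance,
            # so a false positive far from the peptide costs more than a near one.
            weight = 1.0 - math.exp(-(d * d) / (2.0 * sigma * sigma))
            total += weight * -math.log(1.0 - p)
    return total / len(probs)

# Toy usage with made-up coordinates and predictions.
protein = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
peptide = [(0.0, 3.0, 0.0), (0.0, 4.0, 0.0)]
dists = min_distances(protein, peptide)
loss = distance_weighted_bce([0.8, 0.2], [1, 0], dists)
```

A weighting of this kind is one way 3D structure can inform a residue-level loss: errors near the true interface are treated as less severe than errors far from it.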
Nov-3-2025
- Country:
- North America > United States > Maryland > Baltimore (0.04)
- North America > United States > Ohio > Franklin County > Columbus (0.04)
- Genre:
- Research Report > New Finding (0.93)