Enhancing Reinforcement Learning in the 3-Dimensional Hydrophobic-Polar Protein Folding Model with Attention-Based Layers

Liu, Peizheng, Iba, Hitoshi

arXiv.org Artificial Intelligence 

Transformer-based architectures have recently propelled advances in sequence modeling across domains, but their application to the hydrophobic-polar (H-P) model for protein folding remains relatively unexplored. In this work, we adapt a Deep Q-Network (DQN) integrated with attention mechanisms (Transformers) to address the 3D H-P protein folding problem. Our system formulates folding decisions as a self-avoiding walk in a reinforcement learning environment and employs a specialized reward function based on favorable hydrophobic interactions. To improve performance, the method incorporates validity checks including symmetry-breaking constraints, dueling and double Q-learning, and prioritized replay to focus learning on critical transitions. Experimental evaluations on standard benchmark sequences demonstrate that our approach achieves several known best solutions for shorter sequences and obtains near-optimal results for longer chains. This study underscores the promise of attention-based reinforcement learning for protein folding and contributes a prototype Transformer-based Q-network architecture for 3-dimensional lattice models.

1 Introduction

The H-P model has long been considered a simplified model for protein structure prediction. However, optimizing structures under the H-P model still requires efficient algorithms due to the large solution space. Determining the optimal structure of a protein under the hydrophobic-polar (HP) model has been rigorously shown to be NP-complete (1), highlighting the necessity for powerful heuristic or approximation methods in lieu of brute-force search. Among heuristic approaches, Monte Carlo simulations are particularly popular and exhibit a wide range of implementations (2) (3).
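To make the problem setting concrete, the following is a minimal sketch (not the authors' implementation; all names are illustrative) of the two ingredients the abstract describes: folding a sequence as a self-avoiding walk on the 3D cubic lattice, and scoring it by counting favorable non-bonded H-H contacts, which is the standard HP-model energy up to sign.

```python
# Illustrative sketch of the 3D H-P setting: a self-avoiding walk on the
# cubic lattice, scored by non-bonded H-H contacts (energy = -contacts).
# Names and structure here are assumptions, not the paper's actual code.

UNIT_MOVES = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
              (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def fold(sequence, moves):
    """Place residues on the lattice following `moves`.

    Returns the list of coordinates, or None if the walk self-intersects
    (the validity check that restricts folding to self-avoiding walks).
    """
    pos = (0, 0, 0)
    coords = [pos]
    occupied = {pos}
    for dx, dy, dz in moves:
        pos = (pos[0] + dx, pos[1] + dy, pos[2] + dz)
        if pos in occupied:   # collision: not a valid self-avoiding walk
            return None
        occupied.add(pos)
        coords.append(pos)
    return coords

def hh_contacts(sequence, coords):
    """Count H-H pairs that are lattice neighbors but not chain neighbors."""
    count = 0
    for i in range(len(sequence)):
        for j in range(i + 2, len(sequence)):  # skip bonded neighbors
            if sequence[i] == 'H' and sequence[j] == 'H':
                dist = sum(abs(a - b) for a, b in zip(coords[i], coords[j]))
                if dist == 1:                  # adjacent lattice sites
                    count += 1
    return count

# Example: a U-shaped 4-residue walk for the sequence HPHH brings the
# first and last H residues onto neighboring sites, giving one contact.
coords = fold("HPHH", [(1, 0, 0), (0, 1, 0), (-1, 0, 0)])
reward = hh_contacts("HPHH", coords)  # the reward signal favors such contacts
```

In the reinforcement learning formulation, each of the six unit moves is one discrete action, a colliding move is rejected by the validity check, and the H-H contact count (or its increment) serves as the reward the DQN maximizes.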