Deep Reinforcement Learning for Modelling Protein Complexes
Gao, Ziqi; Feng, Tao; You, Jiaxuan; Zi, Chenyi; Zhou, Yan; Zhang, Chen; Li, Jia
arXiv.org Artificial Intelligence
AlphaFold can be used for both single-chain and multi-chain protein structure prediction, but the latter becomes extremely challenging as the number of chains increases. In this work, by taking each chain as a node and each assembly action as an edge, we show that an acyclic undirected connected graph (i.e., a tree) can be used to predict the structure of multi-chain protein complexes (a.k.a. protein complex modelling, PCM). To address the challenges of PCM, we propose GAPN, a Generative Adversarial Policy Network powered by domain-specific rewards and an adversarial loss, trained through policy gradient, for automatic PCM prediction. Specifically, GAPN learns to efficiently search through the immense assembly space and to optimize the direct docking reward through policy gradient. Importantly, we design an adversarial reward function to enhance the receptive field of our model. In this way, GAPN simultaneously focuses on a specific batch of complexes and on the global assembly rules learned from complexes with varied chain numbers. Empirically, we achieve significant improvements in both accuracy (measured by RMSD and TM-Score) and efficiency compared to leading PCM software.

AlphaFold-Multimer (Evans et al., 2021) faces difficulties in maintaining high accuracy when dealing with complexes with a larger number (> 9) of chains (Bryant et al., 2022a; Burke et al., 2023; Bryant et al., 2022b).
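The graph formulation above (chains as nodes, assembly actions as edges, the result an acyclic undirected connected graph) can be made concrete with a small sketch. It is not the authors' implementation: the function names `is_assembly_tree` and `count_assembly_trees` are hypothetical, chains are assumed to be indexed 0..N-1, and the count relies on Cayley's formula (N^(N-2) labelled trees on N nodes), a standard graph-theory result that the text above does not itself state. The sketch only covers which chains are joined, not the docking pose chosen for each action.

```python
from collections import defaultdict


def is_assembly_tree(num_chains, actions):
    """Return True if `actions` (pairs of chain indices) form a tree over all chains."""
    # A tree on N nodes has exactly N - 1 edges.
    if num_chains < 1 or len(actions) != num_chains - 1:
        return False
    neighbours = defaultdict(set)
    for a, b in actions:
        # Reject self-loops and indices outside 0..N-1.
        if a == b or not (0 <= a < num_chains and 0 <= b < num_chains):
            return False
        neighbours[a].add(b)
        neighbours[b].add(a)
    # With exactly N - 1 edges, being connected implies being acyclic,
    # so a single traversal from chain 0 is enough.
    seen, stack = {0}, [0]
    while stack:
        node = stack.pop()
        for nxt in neighbours[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == num_chains


def count_assembly_trees(num_chains):
    """Number of labelled trees on N nodes (Cayley's formula, N^(N-2))."""
    if num_chains < 1:
        raise ValueError("num_chains must be at least 1")
    return 1 if num_chains <= 2 else num_chains ** (num_chains - 2)
```

For example, `is_assembly_tree(4, [(0, 1), (1, 2), (1, 3)])` accepts a four-chain assembly in which chain 1 is docked to each of the others, while `count_assembly_trees(10)` gives 100,000,000 candidate topologies for a ten-chain complex, which illustrates why the assembly space is described as immense.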
May-6-2024