Relaxed Sequence Sampling for Diverse Protein Design
Joohwan Ko, Aristofanis Rontogiannis, Yih-En Andrew Ban, Axel Elaldi, Nicholas Franklin
arXiv.org Artificial Intelligence
Protein design using structure prediction models such as AlphaFold2 has shown remarkable success, but existing approaches like relaxed sequence optimization (RSO) rely on single-path gradient descent and ignore sequence-space constraints, limiting diversity and designability. We introduce Relaxed Sequence Sampling (RSS), a Markov chain Monte Carlo (MCMC) framework that integrates structural and evolutionary information for protein design. RSS operates in continuous logit space, combining gradient-guided exploration with protein language model-informed jumps. Its energy function couples AlphaFold2-derived structural objectives with ESM2-derived sequence priors, balancing accuracy and biological plausibility. In an in silico protein binder design task, RSS produces 5$\times$ more designable structures and 2-3$\times$ greater structural diversity than RSO baselines, at equal computational cost. These results highlight RSS as a principled approach for efficiently exploring the protein design landscape.
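The sampling idea in the abstract (local gradient-guided moves in a continuous logit space, interleaved with occasional jumps drawn from a sequence prior) can be illustrated with a toy sketch. This is not the authors' implementation: the quadratic `energy` below is a hypothetical stand-in for the AlphaFold2- and ESM2-derived terms, the Gaussian `jump_move` stands in for a language-model-informed proposal, and the alphabet, step sizes, and function names are all assumptions chosen for illustration. The local move is a standard Metropolis-adjusted Langevin step and the jump is a standard independence proposal, each with its own Metropolis-Hastings correction.

```python
import math
import random

ALPHABET = "ACDE"  # hypothetical 4-letter alphabet, for illustration only


def energy(x, target, prior_mean, lam):
    """Toy energy: a 'structural' quadratic plus a weighted 'sequence prior' quadratic."""
    structural = 0.5 * sum((a - t) ** 2 for a, t in zip(x, target))
    prior = 0.5 * sum((a - m) ** 2 for a, m in zip(x, prior_mean))
    return structural + lam * prior


def grad_energy(x, target, prior_mean, lam):
    """Analytic gradient of the toy energy with respect to the logits."""
    return [(a - t) + lam * (a - m) for a, t, m in zip(x, target, prior_mean)]


def log_gauss(x, mean, var):
    """Isotropic Gaussian log-density without its normalizing constant
    (the constant cancels because both directions use the same variance)."""
    return -sum((a - m) ** 2 for a, m in zip(x, mean)) / (2.0 * var)


def accept(log_ratio, rng):
    """Metropolis-Hastings accept/reject on a log acceptance ratio."""
    return rng.random() < math.exp(min(0.0, log_ratio))


def langevin_move(x, target, prior_mean, lam, step, rng):
    """Gradient-guided local move (Metropolis-adjusted Langevin)."""
    def drift(z):
        g = grad_energy(z, target, prior_mean, lam)
        return [a - step * gi for a, gi in zip(z, g)]

    fwd = drift(x)
    prop = [m + math.sqrt(2.0 * step) * rng.gauss(0.0, 1.0) for m in fwd]
    rev = drift(prop)
    log_ratio = (energy(x, target, prior_mean, lam)
                 - energy(prop, target, prior_mean, lam)
                 + log_gauss(x, rev, 2.0 * step)
                 - log_gauss(prop, fwd, 2.0 * step))
    return prop if accept(log_ratio, rng) else x


def jump_move(x, target, prior_mean, lam, jump_var, rng):
    """Global jump: independence proposal drawn around the prior mean."""
    prop = [m + math.sqrt(jump_var) * rng.gauss(0.0, 1.0) for m in prior_mean]
    log_ratio = (energy(x, target, prior_mean, lam)
                 - energy(prop, target, prior_mean, lam)
                 + log_gauss(x, prior_mean, jump_var)
                 - log_gauss(prop, prior_mean, jump_var))
    return prop if accept(log_ratio, rng) else x


def sample(target, prior_mean, n_steps, lam=0.5, step=0.1,
           p_jump=0.1, jump_var=1.0, seed=0):
    """Run the chain, choosing a jump with probability p_jump at each step."""
    rng = random.Random(seed)
    x = list(prior_mean)
    chain = []
    for _ in range(n_steps):
        if rng.random() < p_jump:
            x = jump_move(x, target, prior_mean, lam, jump_var, rng)
        else:
            x = langevin_move(x, target, prior_mean, lam, step, rng)
        chain.append(x)
    return chain


def decode(x, n_positions):
    """Map flat logits (n_positions * len(ALPHABET)) to a sequence by per-position argmax."""
    k = len(ALPHABET)
    return "".join(
        ALPHABET[max(range(k), key=lambda j: x[i * k + j])]
        for i in range(n_positions)
    )
```

Calling `sample(target, prior_mean, n_steps=3000)` with a flat list of per-position logits returns the visited logit vectors, and `decode` turns any of them into a discrete sequence. In this sketch the weight `lam` plays the role of the trade-off between the structural objective and the sequence prior; how the paper sets or schedules that balance is not stated in the abstract.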
Oct-29-2025
- Country:
- North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
- Genre:
- Research Report (0.40)