Reinforcement learning on structure-conditioned categorical diffusion for protein inverse folding