ReST-RL: Achieving Accurate Code Reasoning of LLMs with Optimized Self-Training and Decoding
