Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning

Jun-14-2026, 05:10:47 GMT–Neural Information Processing Systems

Mathematical reasoning in large language models has been successfully incentivized through reinforcement learning with verifiable rewards, leading to improved one-shot precision. In this work, we turn our focus to the coding domain. Beyond one-shot precision, we highlight unit test generation as another key factor in coding ability, since accurate unit tests are essential for self-checking and self-correction during inference. Traditional approaches to fine-tuning LLMs for unit test generation rely heavily on ground-truth code solutions in the training data. We propose CURE, a novel reinforcement learning framework with a dedicated reward design that co-evolves coding and unit test generation capabilities based on their interaction outcomes, without any ground-truth code as supervision.
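To make the self-checking idea concrete, the following is a minimal sketch of how generated unit tests could be used to pick among several candidate solutions at inference time. It is not CURE's reward design, which the abstract does not specify; the names `run_test` and `select_best`, the toy task (squaring an integer), and the hand-written candidates and tests are all hypothetical stand-ins for model outputs.

```python
from typing import Callable, List

# Hypothetical types: a candidate solution maps int -> int, and a
# unit test takes a candidate and reports whether it passed.
Solution = Callable[[int], int]
UnitTest = Callable[[Solution], bool]


def run_test(test: UnitTest, solution: Solution) -> bool:
    """Run one test against one candidate; a crash counts as a failure."""
    try:
        return bool(test(solution))
    except Exception:
        return False


def select_best(solutions: List[Solution], tests: List[UnitTest]) -> int:
    """Return the index of the candidate that passes the most tests."""
    scores = [sum(run_test(t, s) for t in tests) for s in solutions]
    return max(range(len(solutions)), key=scores.__getitem__)


# Toy stand-ins for sampled code candidates (task: square the input).
candidates: List[Solution] = [
    lambda x: x * 2,   # wrong, but agrees with squaring on 0 and 2
    lambda x: x * x,   # correct
    lambda x: x + 1,   # wrong
]

# Toy stand-ins for generated unit tests.
generated_tests: List[UnitTest] = [
    lambda f: f(2) == 4,
    lambda f: f(3) == 9,
    lambda f: f(0) == 0,
]

best_index = select_best(candidates, generated_tests)
```

The selection is only as good as the tests: with the `f(3) == 9` test removed, the first two candidates would tie, which illustrates why the accuracy of the generated unit tests matters for self-checking.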

large language model, machine learning, reinforcement learning

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (0.62)
  - Machine Learning > Reinforcement Learning (0.55)