Towards Better Correctness and Efficiency in Code Generation
Feng, Yunlong, Xu, Yang, Xu, Xiao, Hui, Binyuan, Lin, Junyang
arXiv.org Artificial Intelligence
While code large language models have demonstrated remarkable progress in code generation, the generated code often exhibits poor runtime efficiency, limiting its practical application in performance-sensitive scenarios. To address this limitation, we propose an efficiency-oriented reinforcement learning framework guided by a novel performance reward. Based on this framework, we take a deeper dive into the code efficiency problem, identifying key bottlenecks and proposing methods to overcome them: (1) Dynamic exploration overcomes the static data constraints of offline fine-tuning, enabling the discovery of more efficient code implementations. (2) An error-insensitive reinforcement learning method and high-contrast efficiency signals are crucial for mitigating systematic errors and achieving effective optimization. (3) Online exploration is most effective when starting from a high-correctness baseline, as this allows for efficiency improvements without sacrificing accuracy. Building on these findings, we propose a two-stage tuning method that achieves high and balanced performance across correctness and efficiency. Experimental results show the effectiveness of the method, which improves code correctness by 10.18% and runtime efficiency by 7.75% on a 7B model, achieving performance comparable to a much larger model.
Aug-29-2025
- Country:
  - Asia
  - Europe > Austria
    - Vienna (0.14)
  - North America
    - Canada > British Columbia
      - Vancouver (0.04)
    - United States > Louisiana
      - Orleans Parish > New Orleans (0.04)
- Genre:
  - Research Report (0.82)
- Technology: