ReflectionCoder: Learning from Reflection Sequence for Enhanced One-off Code Generation
Ren, Houxing, Zhan, Mingjie, Wu, Zhongyuan, Zhou, Aojun, Pan, Junting, Li, Hongsheng
arXiv.org Artificial Intelligence
Code generation plays a crucial role in various tasks, such as code auto-completion and mathematical reasoning. Previous work has proposed numerous methods to enhance code generation performance, including integrating feedback from the compiler. Inspired by this, we present ReflectionCoder, a novel approach that effectively leverages reflection sequences constructed by integrating compiler feedback to improve one-off code generation performance. Furthermore, we propose reflection self-distillation and dynamically masked distillation to effectively utilize these reflection sequences. Extensive experiments on three benchmarks, i.e., HumanEval (+), MBPP (+), and MultiPL-E, demonstrate that models fine-tuned with our method achieve state-of-the-art performance. Notably, ReflectionCoder-DeepSeek-Coder-33B reaches pass@1 of 82.9 (76.8) on HumanEval (+) and 84.1 (72.0) on MBPP (+), on par with GPT-3.5-Turbo and Claude-3-opus, and surpasses early GPT-4. Beyond the code domain, we believe this approach can benefit other domains that focus on final results and require long reasoning paths. Code and data are available at https://github.com/SenseLLM/ReflectionCoder.
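The abstract does not detail how "dynamically masked distillation" operates, so the following is only an illustrative sketch under one plausible reading: during fine-tuning, a random fraction of the reflection-sequence tokens is excluded from the training loss (by setting their labels to the ignore index used by common language-model loss functions), so the model cannot rely on the full reflection trace. All names (`dynamically_mask_labels`, `reflection_positions`, `mask_ratio`) are hypothetical and not taken from the paper.

```python
import random

IGNORE_INDEX = -100  # labels with this value are typically excluded from the LM loss


def dynamically_mask_labels(labels, reflection_positions, mask_ratio, seed=None):
    """Return a copy of `labels` with a random fraction of the
    reflection-sequence tokens excluded from the training loss.

    labels               -- list of target token ids for one training example
    reflection_positions -- indices of tokens belonging to the reflection sequence
    mask_ratio           -- fraction of those positions to mask, in [0.0, 1.0]
    seed                 -- optional seed for reproducible masking
    """
    rng = random.Random(seed)
    masked = list(labels)  # do not mutate the caller's list
    n_mask = int(len(reflection_positions) * mask_ratio)
    for pos in rng.sample(list(reflection_positions), n_mask):
        masked[pos] = IGNORE_INDEX
    return masked


# Example: mask half of the reflection tokens (positions 2..5) in a 10-token target.
labels = list(range(10))
masked = dynamically_mask_labels(labels, [2, 3, 4, 5], mask_ratio=0.5, seed=0)
```

In an actual training loop, `mask_ratio` could be scheduled (e.g., increased over training) so the model is gradually weaned off the reflection sequence; the abstract does not confirm this schedule.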
May-27-2024