the performance of various LLMs on code synthesis. However, these test-cases can be limited in both quantity and quality for fully assessing the functional

Neural Information Processing Systems 

Program synthesis has long been studied, with recent approaches focusing on directly using Large Language Models (LLMs) to generate code.