NaturalCodeBench: Examining Coding Performance Mismatch on HumanEval and Natural User Prompts
