EvoCodeBench: An Evolving Code Generation Benchmark with Domain-Specific Evaluations

Neural Information Processing Systems 

How to evaluate Large Language Models (LLMs) in code generation remains an open question.
