COMPASS: A Multi-Dimensional Benchmark for Evaluating Code Generation in Large Language Models
