Beyond Synthetic Benchmarks: Evaluating LLM Performance on Real-World Class-Level Code Generation
