DebugBench: Evaluating Debugging Capability of Large Language Models
