CriticBench: Evaluating Large Language Models as Critic
