OptimalThinkingBench: Evaluating Over and Underthinking in LLMs
