MR-Ben: A Meta-Reasoning Benchmark for Evaluating System-2 Thinking in LLMs
Yingjia Wan², Jingyao Li¹