The Price of a Second Thought: On the Evaluation of Reasoning Efficiency in Large Language Models