Does Thinking More always Help? Mirage of Test-Time Scaling in Reasoning Models
