Position: On the Methodological Pitfalls of Evaluating Base LLMs for Reasoning
