Position: On the Methodological Pitfalls of Evaluating Base LLMs for Reasoning