Right Is Not Enough: The Pitfalls of Outcome Supervision in Training LLMs for Math Reasoning
