R1-Code-Interpreter: LLMs Reason with Code via Supervised and Multi-stage Reinforcement Learning

Open in new window