RL Squeezes, SFT Expands: A Comparative Study of Reasoning LLMs
