Why Distillation can Outperform Zero-RL: The Role of Flexible Reasoning
