Act Only When It Pays: Efficient Reinforcement Learning for LLM Reasoning via Selective Rollouts
