Improving Sampling Efficiency in RLVR through Adaptive Rollout and Response Reuse
