Position: The Hidden Costs and Measurement Gaps of Reinforcement Learning with Verifiable Rewards
