Rubrics as Rewards: Reinforcement Learning Beyond Verifiable Domains
