ROC-n-reroll: How verifier imperfection affects test-time scaling
