SCAN: Self-Denoising Monte Carlo Annotation for Robust Process Reward Learning
