From Mathematical Reasoning to Code: Generalization of Process Reward Models in Test-Time Scaling
