SPARK: Stepwise Process-Aware Rewards for Reference-Free Reinforcement Learning
