Promoting Efficient Reasoning with Verifiable Stepwise Reward
