Better Process Supervision with Bi-directional Rewarding Signals