Computational Benefits of Intermediate Rewards for Goal-Reaching Policy Learning