Beyond variance reduction: Understanding the true impact of baselines on policy optimization