The Role of Baselines in Policy Gradient Optimization