The True Impact of Baselines in Policy Gradient Methods – Marlos C. Machado