An Empirical Analysis of Measure-Valued Derivatives for Policy Gradients