Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning