On Variance Reduction in Stochastic Gradient Descent and its Asynchronous Variants