On the Ineffectiveness of Variance Reduced Optimization for Deep Learning