On the Ineffectiveness of Variance Reduced Optimization for Deep Learning
The application of stochastic variance reduction to optimization has shown remarkable recent theoretical and practical success. The applicability of these techniques to the hard non-convex optimization problems encountered during training of modern deep neural networks is an open problem. We show that naive application of the SVRG technique and related approaches fails, and explore why.
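As background for the SVRG technique named in the abstract, here is a minimal sketch of the standard SVRG update on a toy least-squares problem. The problem, step size, and all names are illustrative, not taken from the paper.

```python
import numpy as np

# f(w) = (1/n) * sum_i 0.5 * (x_i . w - y_i)^2  (toy objective)

def grad_i(w, X, y, i):
    """Gradient of the i-th component function at w."""
    return (X[i] @ w - y[i]) * X[i]

def full_grad(w, X, y):
    """Full-batch gradient at w."""
    return X.T @ (X @ w - y) / len(y)

def svrg(X, y, lr=0.01, epochs=50, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        w_snap = w.copy()                 # snapshot iterate
        mu = full_grad(w_snap, X, y)      # full gradient at the snapshot
        for _ in range(n):                # inner stochastic loop
            i = rng.integers(n)
            # variance-reduced gradient estimate:
            # unbiased, with variance shrinking as w approaches w_snap
            g = grad_i(w, X, y, i) - grad_i(w_snap, X, y, i) + mu
            w -= lr * g
    return w

# Noiseless toy problem with a known solution
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
w_true = np.arange(5, dtype=float)
y = X @ w_true
w_hat = svrg(X, y)
print(np.allclose(w_hat, w_true, atol=1e-3))
```

On this convex toy problem SVRG converges linearly; the paper's point is that this behavior does not transfer naively to deep networks.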
Reviews: On the Ineffectiveness of Variance Reduced Optimization for Deep Learning
I'm glad you commented on the learning rate selection, because this was a major point of our discussion. The main reason I can't increase my score is that many definitions, explanations, and experiment details are missing, making it extremely hard to evaluate the real value of your experiments. This was further complicated by the fact that you didn't provide your code when submitting the paper. I hope that you will do a major revision, for example including a section in the supplementary material with all experiment details. Just in case, here are my suggestions for some extra experiments: 1. Measure and plot the effect of data augmentation on the bias of the gradient.
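The reviewer's suggested experiment could be sketched as follows: compare the gradient averaged over random augmentations against the clean full gradient. The model (least squares) and the augmentation (additive input noise) are illustrative stand-ins chosen for this sketch, not the paper's setup.

```python
import numpy as np

def full_grad(w, X, y):
    """Clean full-batch least-squares gradient."""
    return X.T @ (X @ w - y) / len(y)

def augmented_grad(w, X, y, rng, sigma=0.3, n_draws=1000):
    """Gradient averaged over random input augmentations."""
    g = np.zeros_like(w)
    for _ in range(n_draws):
        X_aug = X + sigma * rng.normal(size=X.shape)  # one augmentation draw
        g += full_grad(w, X_aug, y)
    return g / n_draws

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.ones(5)
w = rng.normal(size=5)

# Bias of the augmented gradient relative to the clean one; for this
# additive-noise augmentation the expected bias is sigma^2 * w, so it
# is nonzero whenever w is.
bias = np.linalg.norm(augmented_grad(w, X, y, rng) - full_grad(w, X, y))
print(bias)
```

Plotting this quantity over training iterations (or over augmentation strengths) would give the curve the reviewer asks for.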
The reviewers thought this paper provided an interesting analysis of the failure of variance reduction methods for deep nets. They however raised several concerns about the lack of details about the experimental setup, especially since such details can affect the outcome of the paper. It is true that, for such papers, there are always more experiments the authors could run to broaden or strengthen their statements. That being said, I decided to accept the paper for the following reasons: - Requiring a large number of experiments for a paper to be accepted to NeurIPS would preclude those with limited compute resources from having papers published. Thus, accepting this paper will provide a basis for experiments by other researchers.
Papers published at the Neural Information Processing Systems Conference.