Reviews: Stochastic Cubic Regularization for Fast Nonconvex Optimization

Neural Information Processing Systems 

This submission considers stochastic nonconvex optimization problems in which only stochastic estimates of the objective and its derivatives can be accessed at every iteration. The authors develop a variant of the cubic regularization framework that requires only stochastic gradients and products of stochastic Hessians with vectors. The method is shown to reach a point at which the gradient norm is smaller than \epsilon and the minimum Hessian eigenvalue is larger than -\sqrt{\rho \epsilon} in a number of stochastic queries (gradient or Hessian-vector product) of order \epsilon^{-3.5}, which improves over the classical \epsilon^{-4} complexity of Stochastic Gradient Descent (SGD). The problem of interest is clearly introduced, along with the possible advantages of using both stochastic estimates and a cubic regularization framework. The associated literature is correctly reviewed, and the authors even cite contemporary work that achieves similar complexity guarantees but relies on stochastic gradient estimates and variance reduction.
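For concreteness, the access model described above can be illustrated with a minimal sketch (not the authors' implementation): each outer iteration forms a stochastic cubic-regularized model of the objective and approximately minimizes it by gradient descent, touching the Hessian only through Hessian-vector products. Function names, step sizes, and iteration counts here are illustrative assumptions.

```python
import numpy as np

def solve_cubic_subproblem(g, hvp, rho, lr=0.1, steps=500):
    """Approximately minimize the cubic model
        m(s) = g.s + 0.5 * s.(H s) + (rho/6) * ||s||^3
    by gradient descent, using H only via Hessian-vector
    products hvp(s) = H s (illustrative inner solver)."""
    s = np.zeros_like(g)
    for _ in range(steps):
        # gradient of m: g + H s + (rho/2) * ||s|| * s
        grad_m = g + hvp(s) + 0.5 * rho * np.linalg.norm(s) * s
        s = s - lr * grad_m
    return s

def stochastic_cubic_step(x, stoch_grad, stoch_hvp, rho):
    """One outer iteration: draw a stochastic gradient at x and take
    the step that approximately minimizes the cubic model built from
    stochastic gradient and Hessian-vector product oracles."""
    g = stoch_grad(x)
    s = solve_cubic_subproblem(g, lambda v: stoch_hvp(x, v), rho)
    return x + s
```

On a toy saddle such as f(x) = 0.5(x_1^2 - 0.5 x_2^2), a single such step decreases f by exploiting the negative-curvature direction, which is exactly the behavior the -\sqrt{\rho \epsilon} second-order guarantee formalizes.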