Goto

Collaborating Authors

 non-convex optimization



Adaptive Negative Curvature Descent with Applications in Non-convex Optimization

Neural Information Processing Systems

Negative curvature descent (NCD) method has been utilized to design deterministic or stochastic algorithms for non-convex optimization aiming at finding second-order stationary points or local minima. In existing studies, NCD needs to approximate the smallest eigen-value of the Hessian matrix with a sufficient precision (e.g., $\epsilon_2\ll 1$) in order to achieve a sufficiently accurate second-order stationary solution (i.e., $\lambda_{\min}(\nabla^2 f(\x))\geq -\epsilon_2)$. One issue with this approach is that the target precision $\epsilon_2$ is usually set to be very small in order to find a high quality solution, which increases the complexity for computing a negative curvature. To address this issue, we propose an adaptive NCD to allow for an adaptive error dependent on the current gradient's magnitude in approximating the smallest eigen-value of the Hessian, and to encourage competition between a noisy NCD step and gradient descent step. We consider the applications of the proposed adaptive NCD for both deterministic and stochastic non-convex optimization, and demonstrate that it can help reduce the the overall complexity in computing the negative curvatures during the course of optimization without sacrificing the iteration complexity.




SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path-Integrated Differential Estimator

Neural Information Processing Systems

We provide a few error-bound results on its convergence rates. Specially, we prove that theSPIDER-SFO algorithm achieves a gradient computation cost of O min(n1/2 2, 3) to find an -approximate first-order stationary point. In addition, we prove thatSPIDER-SFO nearly matches the algorithmic lower bound for finding stationary point under the gradient Lipschitz assumption in the finite-sum setting.


Quantum Algorithms for Non-smooth Non-convex Optimization

Neural Information Processing Systems

This paper considers the problem of finding the (δ,ǫ)-Goldstein stationary point of the Lipschitz continuous objective, which is a rich funct ion class to cover a large number of important applications. We construct a nove l zeroth-order quantum estimator for the gradient of the smoothed surrogate.