Escape saddle points by a simple gradient-descent based algorithm
Neural Information Processing Systems
Escaping saddle points is a central research topic in nonconvex optimization. In this paper, we propose a simple gradient-based algorithm such that for a smooth function f\colon\mathbb{R}^n\to\mathbb{R}, it outputs an \epsilon-approximate second-order stationary point in \tilde{O}(\log n/\epsilon^{1.75}) iterations. Technically, our main contribution is an idea of implementing a robust Hessian power method using only gradients, which can find negative curvature near saddle points and achieve a polynomial speedup in \log n compared to perturbed gradient descent methods. Finally, we also perform numerical experiments that support our results.
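The abstract's core idea is extracting a negative-curvature direction using only gradient queries: a Hessian-vector product can be approximated by a finite difference of gradients, and a power method on a shifted Hessian then amplifies the most negative eigendirection. The sketch below illustrates this mechanism, not the paper's exact algorithm (which adds robustness to gradient perturbations and precise parameter choices); the function names `hess_vec_product` and `negative_curvature_direction`, the shift parameter, and the quadratic test function are illustrative assumptions.

```python
import numpy as np

def hess_vec_product(grad, x, v, r=1e-4):
    # Gradient-only Hessian-vector product via finite differences:
    # H(x) v ≈ (∇f(x + r v) − ∇f(x)) / r.
    return (grad(x + r * v) - grad(x)) / r

def negative_curvature_direction(grad, x, n_iters=50, shift=1.0, r=1e-4, seed=0):
    """Power method on (shift * I − H) to amplify the most negative
    eigendirection of the Hessian at x, using only gradient queries.
    Assumes `shift` upper-bounds the largest Hessian eigenvalue."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(x.shape)
    v /= np.linalg.norm(v)
    for _ in range(n_iters):
        hv = hess_vec_product(grad, x, v, r)
        v = shift * v - hv            # apply (shift*I − H) to v
        v /= np.linalg.norm(v)
    # Rayleigh quotient estimates the curvature along v.
    curvature = v @ hess_vec_product(grad, x, v, r)
    return v, curvature

# Toy example: f(x, y) = x^2 − y^2 has a saddle at the origin,
# with Hessian diag(2, −2); the negative curvature lies along y.
grad = lambda p: np.array([2 * p[0], -2 * p[1]])
v, curv = negative_curvature_direction(grad, np.array([0.01, 0.01]), shift=3.0)
print(v, curv)  # v ≈ ±(0, 1), curv ≈ −2
```

Descending along such a direction v (in the sign that decreases f) is what lets a gradient-only method escape the saddle region.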