A Comprehensively Tight Analysis of Gradient Descent for PCA

Jan-18-2025, 21:03:04 GMT–Neural Information Processing Systems

We study the Riemannian gradient method for PCA. Despite the simplicity of this setting, which is the deterministic version of Krasulina's method, its convergence rate is not yet well understood. In this work, we provide a general, tight analysis of the gap-dependent rate O(\frac{1}{\Delta}\log\frac{1}{\epsilon}), which holds for any real symmetric matrix. More importantly, when the gap \Delta is significantly smaller than the target accuracy \epsilon on the objective sub-optimality of the final solution, a rate of this type is no longer tight, which calls for a worst-case rate. We further give the first worst-case analysis, which achieves a convergence rate of O(\frac{1}{\epsilon}\log\frac{1}{\epsilon}). In particular, our gap-dependent analysis suggests a promising new learning rate for stochastic variance-reduced PCA algorithms.
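To make the setting concrete, the following is a minimal sketch of Riemannian gradient ascent on the unit sphere for the leading eigenvector of a real symmetric matrix. It is an illustration under stated assumptions, not the paper's implementation: the function name `riemannian_gd_pca`, the default step size `1 / (2 * ||A||_2)`, the iteration count, and the random initialization are all hypothetical choices made for this example, and the matrix is assumed to be nonzero.

```python
import numpy as np


def riemannian_gd_pca(A, eta=None, iters=500, seed=0):
    """Riemannian gradient ascent on the unit sphere for x^T A x.

    A is assumed to be a nonzero real symmetric matrix. Returns the final
    unit vector and its Rayleigh quotient.
    """
    A = np.asarray(A, dtype=float)
    if eta is None:
        # Illustrative step size (an assumption of this sketch, not the
        # learning rate analysed in the paper).
        eta = 1.0 / (2.0 * np.linalg.norm(A, 2))

    rng = np.random.default_rng(seed)
    x = rng.standard_normal(A.shape[0])
    x /= np.linalg.norm(x)

    for _ in range(iters):
        Ax = A @ x
        rho = x @ Ax             # Rayleigh quotient at the current iterate
        grad = Ax - rho * x      # A x projected onto the tangent space at x
                                 # (Riemannian gradient up to a constant factor)
        x = x + eta * grad       # step along the tangent direction
        x /= np.linalg.norm(x)   # retraction: renormalize onto the sphere

    return x, x @ A @ x


# Demo on a synthetic symmetric matrix with known eigenvalues 3, 1, 0.5, -1.
demo_rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(demo_rng.standard_normal((4, 4)))
A_demo = Q @ np.diag([3.0, 1.0, 0.5, -1.0]) @ Q.T
x_demo, rho_demo = riemannian_gd_pca(A_demo)
print("estimated top eigenvalue:", rho_demo)
```

In this sketch the eigengap \Delta is the difference between the two largest eigenvalues of the demo matrix; shrinking it slows convergence, which is the regime where the abstract argues a worst-case rate becomes the relevant one.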


