A Comprehensively Tight Analysis of Gradient Descent for PCA
Neural Information Processing Systems
We study the Riemannian gradient method for PCA, i.e., the deterministic version of Krasulina's method. Despite the simplicity of this setting, its convergence rate is not yet well understood. In this work, we provide a general tight analysis of the gap-dependent rate O(\frac{1}{\Delta}\log\frac{1}{\epsilon}), which holds for any real symmetric matrix. More importantly, when the gap \Delta is significantly smaller than the target accuracy \epsilon on the objective sub-optimality of the final solution, a rate of this type is no longer tight, which calls for a worst-case rate. We further give the first worst-case analysis, which achieves a convergence rate of O(\frac{1}{\epsilon}\log\frac{1}{\epsilon}). In particular, our gap-dependent analysis suggests a promising new learning rate for stochastic variance-reduced PCA algorithms.
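For readers unfamiliar with the method, the following is a minimal sketch of Riemannian gradient ascent on the unit sphere for the leading eigenvector of a real symmetric matrix, which is one common way to write the deterministic Krasulina-style update. The function name `riemannian_gd_pca`, the constant step size `eta`, the iteration count, and the test matrix are illustrative assumptions, not the learning rate or setup analyzed in the paper.

```python
import numpy as np


def riemannian_gd_pca(A, eta, num_iters, seed=0):
    """Sketch: maximize the Rayleigh quotient w^T A w over the unit sphere.

    A is assumed to be a real symmetric matrix. eta is a hypothetical
    constant step size chosen for illustration only.
    """
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(A.shape[0])
    w /= np.linalg.norm(w)  # start from a random point on the sphere
    for _ in range(num_iters):
        Aw = A @ w
        # Riemannian gradient (up to a constant factor): the component
        # of A w orthogonal to w, i.e. (I - w w^T) A w.
        grad = Aw - (w @ Aw) * w
        w = w + eta * grad
        w /= np.linalg.norm(w)  # retract back onto the unit sphere
    return w


# Toy example: eigenvalues 3, 2, 1, so the eigengap is Delta = 1.
A = np.diag([3.0, 2.0, 1.0])
w = riemannian_gd_pca(A, eta=0.1, num_iters=500)
rayleigh = w @ A @ w  # approaches the top eigenvalue as w converges
```

In this toy case the eigengap is large, so the iterate aligns with the top eigenvector quickly. When the gap is small relative to the target accuracy, progress of this kind slows down, which is the regime where the abstract argues a worst-case rate is the relevant one.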
Jan-18-2025, 21:03:04 GMT