Preconditioned Gradient Descent for Over-Parameterized Nonconvex Matrix Factorization
Neural Information Processing Systems
In practical instances of nonconvex matrix factorization, the rank of the true solution r⋆ is often unknown, so the rank r of the model can be over-specified as r > r⋆. This over-parameterized regime of matrix factorization significantly slows down the convergence of local search algorithms, from a linear rate with r = r⋆ to a sublinear rate when r > r⋆. We propose an inexpensive preconditioner for the matrix sensing variant of nonconvex matrix factorization that restores the convergence rate of gradient descent back to linear, even in the over-parameterized case, while also making it agnostic to possible ill-conditioning in the ground truth. Classical gradient descent in a neighborhood of the solution slows down due to the need for the model matrix factor to become singular. Our key result is that this singularity can be corrected by ℓ2 regularization with a specific range of values for the damping parameter.
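The preconditioner the abstract describes can be illustrated with a short sketch: gradient descent on a rank-r factorization of a matrix-sensing loss, where each step is right-multiplied by (XᵀX + λI)⁻¹. The following is a minimal NumPy demo, not the paper's reference implementation; the problem sizes, the step size, and the damping schedule λ proportional to the square root of the current loss are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r_star, r, m = 20, 2, 5, 600   # model rank r > true rank r_star (over-parameterized)

# Rank-r_star ground truth M = Z Z^T
Z = rng.standard_normal((n, r_star))
M = Z @ Z.T

# Random symmetric Gaussian measurements b_i = <A_i, M>
A = rng.standard_normal((m, n, n))
A = (A + A.transpose(0, 2, 1)) / 2
b = np.einsum('kij,ij->k', A, M)

def loss_and_grad(X):
    """f(X) = (1/2m) sum_i (<A_i, X X^T> - b_i)^2 and its gradient (2/m) sum_i e_i A_i X."""
    e = np.einsum('kij,ij->k', A, X @ X.T) - b
    f = 0.5 * np.mean(e ** 2)
    g = (2.0 / m) * np.einsum('k,kij->ij', e, A) @ X
    return f, g

X = 0.1 * rng.standard_normal((n, r))   # small random initialization
init_err = np.linalg.norm(X @ X.T - M)
eta = 0.2
for _ in range(1000):
    f, g = loss_and_grad(X)
    lam = np.sqrt(f)                    # assumed damping schedule: lambda ~ sqrt(loss)
    # Preconditioned step: damp the gradient by (X^T X + lam * I)^{-1} on the right
    X = X - eta * g @ np.linalg.inv(X.T @ X + lam * np.eye(r))

final_err = np.linalg.norm(X @ X.T - M)
```

The damping λ keeps the r×r preconditioner invertible even though, at an over-parameterized solution, X itself must become (numerically) rank-deficient; as the loss shrinks, λ shrinks with it.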