completion
Matrix Completion has No Spurious Local Minimum
Matrix completion is a basic machine learning problem that has wide applications, especially in collaborative filtering and recommender systems. Simple non-convex optimization algorithms are popular and effective in practice. Despite recent progress in proving various non-convex algorithms converge from a good initial point, it remains unclear why random or arbitrary initialization suffices in practice. We prove that the commonly used non-convex objective function for matrix completion has no spurious local minima --- all local minima must also be global. Therefore, many popular optimization algorithms such as (stochastic) gradient descent can provably solve matrix completion with \textit{arbitrary} initialization in polynomial time.
Provable Efficient Online Matrix Completion via Non-convex Stochastic Gradient Descent
Matrix completion, where we wish to recover a low rank matrix by observing a few entries from it, is a widely studied problem in both theory and practice with wide applications. Most of the provable algorithms so far on this problem have been restricted to the offline setting where they provide an estimate of the unknown matrix using all observations simultaneously. However, in many applications, the online version, where we observe one entry at a time and dynamically update our estimate, is more appealing. While existing algorithms are efficient for the offline setting, they could be highly inefficient for the online setting. In this paper, we propose the first provable, efficient online algorithm for matrix completion. Our algorithm starts from an initial estimate of the matrix and then performs non-convex stochastic gradient descent (SGD). After every observation, it performs a fast update involving only one row of two tall matrices, giving near linear total runtime. Our algorithm can be naturally used in the offline setting as well, where it gives competitive sample complexity and runtime to state of the art algorithms. Our proofs introduce a general framework to show that SGD updates tend to stay away from saddle surfaces and could be of broader interests to other non-convex problems.
- Research Report > Experimental Study (0.93)
- Workflow (0.67)
- North America > United States > Virginia (0.04)
- North America > Canada (0.04)
- Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Information Technology (1.00)
- Leisure & Entertainment (0.93)
- Banking & Finance > Trading (0.92)
- (2 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- (2 more...)
- North America > United States > Oregon (0.04)
- Asia > Middle East > Israel (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
- Information Technology (0.67)
- Banking & Finance (0.45)
- Europe > Switzerland > Zürich > Zürich (0.14)
- Asia > China > Hong Kong (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- (8 more...)