Robust recovery


Rank Overspecified Robust Matrix Recovery: Subgradient Method and Exact Recovery

Neural Information Processing Systems

We study the robust recovery of a low-rank matrix from sparsely and grossly corrupted Gaussian measurements, with no prior knowledge of the intrinsic rank. We consider the robust matrix factorization approach: we employ a robust $\ell_1$ loss function and handle the unknown rank by using an overspecified factored representation of the matrix variable. We then solve the associated nonconvex, nonsmooth problem using a subgradient method with diminishing stepsizes. We show that, under a regularity condition on the sensing matrices and corruption which we call the restricted direction preserving property (RDPP), the subgradient method converges to the exact low-rank solution at a sublinear rate even when the rank is overspecified. Moreover, our result is more general in that the convergence automatically speeds up to a linear rate once the factor rank matches the unknown rank.
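As a concrete illustration of the approach described in the abstract, below is a minimal numpy sketch of a subgradient method with a diminishing stepsize applied to an overspecified factorization. It assumes a symmetric parameterization X = U U^T, a geometrically decaying stepsize as one concrete diminishing schedule, and synthetic Gaussian sensing matrices; the function name and all parameter choices are illustrative and not taken from the paper.

```python
import numpy as np

def robust_subgradient_recovery(A, y, r_over, steps=500, mu0=0.1, q=0.99, seed=0):
    """Subgradient method with diminishing stepsizes for the ell_1 loss
    f(U) = (1/m) * sum_i |<A_i, U U^T> - y_i|, with an overspecified rank r_over."""
    m, n, _ = A.shape
    rng = np.random.default_rng(seed)
    U = 1e-3 * rng.standard_normal((n, r_over))               # small random initialization
    for k in range(steps):
        residuals = np.einsum('mij,ij->m', A, U @ U.T) - y
        G = np.einsum('m,mij->ij', np.sign(residuals), A) / m  # subgradient w.r.t. X
        U -= mu0 * (q ** k) * (G + G.T) @ U                    # chain rule through X = U U^T
    return U @ U.T

# Toy usage: rank-1 ground truth, 10% of the measurements grossly corrupted.
rng = np.random.default_rng(1)
n, m, r_over = 20, 800, 5
U_star = rng.standard_normal((n, 1))
X_star = U_star @ U_star.T
A = rng.standard_normal((m, n, n))
y = np.einsum('mij,ij->m', A, X_star)
mask = rng.random(m) < 0.1
y[mask] += 10.0 * rng.standard_normal(mask.sum())
X_hat = robust_subgradient_recovery(A, y, r_over)
print(np.linalg.norm(X_hat - X_star) / np.linalg.norm(X_star))  # relative recovery error
```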


Robust Recovery via Implicit Bias of Discrepant Learning Rates for Double Over-parameterization

Neural Information Processing Systems

Recent advances have shown that the implicit bias of gradient descent on over-parameterized models enables the recovery of low-rank matrices from linear measurements, even with no prior knowledge of the intrinsic rank. In contrast, for {\em robust} low-rank matrix recovery from {\em grossly corrupted} measurements, over-parameterization leads to overfitting unless both the intrinsic rank and the sparsity of the corruption are known a priori. This paper shows that with a {\em double over-parameterization}, of both the low-rank matrix and the sparse corruption, gradient descent with {\em discrepant learning rates} provably recovers the underlying matrix even without prior knowledge of either the rank of the matrix or the sparsity of the corruption. We further extend our approach to the robust recovery of natural images by over-parameterizing images with deep convolutional networks. Experiments show that our method handles different test images and varying corruption levels with a single learning pipeline, where the network width and termination conditions do not need to be adjusted on a case-by-case basis. Underlying the success is again the implicit bias with discrepant learning rates on different over-parameterized parameters, which may bear on broader applications.
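A minimal numpy sketch of the double over-parameterization idea as described in the abstract: the low-rank part is written as U U^T with a full-width U, the sparse corruption as an over-parameterized difference of squares g*g - h*h, and plain gradient descent runs on a least-squares fit with a smaller (discrepant) learning rate on the corruption variables. The particular corruption parameterization, the loss, and all names and parameters here are assumptions made for illustration, not the paper's exact formulation.

```python
import numpy as np

def double_overparam_gd(A, y, steps=3000, lr_U=2e-3, alpha=0.5, seed=0):
    """Gradient descent on the doubly over-parameterized least-squares fit
    (1/m) * sum_i (<A_i, U U^T> + g_i^2 - h_i^2 - y_i)^2,
    using a discrepant learning rate lr_s = alpha * lr_U on (g, h)."""
    m, n, _ = A.shape
    rng = np.random.default_rng(seed)
    U = 1e-3 * rng.standard_normal((n, n))   # over-parameterized: full width, no rank prior
    g = 1e-3 * np.ones(m)                    # corruption s = g*g - h*h, no sparsity prior
    h = 1e-3 * np.ones(m)
    lr_s = alpha * lr_U
    for _ in range(steps):
        r = np.einsum('mij,ij->m', A, U @ U.T) + g * g - h * h - y  # residuals
        G = np.einsum('m,mij->ij', r, A) / m
        U -= lr_U * 2.0 * (G + G.T) @ U      # gradient through X = U U^T
        g -= lr_s * 4.0 / m * r * g          # gradient through s = g*g - h*h
        h += lr_s * 4.0 / m * r * h
    return U @ U.T, g * g - h * h            # recovered matrix and corruption estimate
```

The only algorithmic knob distinguishing this from naive over-parameterized gradient descent is alpha, the ratio between the two learning rates, which is what the abstract refers to as the discrepant learning rates.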


Review for NeurIPS paper: Robust Recovery via Implicit Bias of Discrepant Learning Rates for Double Over-parameterization

Neural Information Processing Systems

Additional Feedback: To be honest, I find the term "double overparametrization" a bit strange. I would still call it simply "overparametrization". Perhaps the authors could think about this point and potentially adjust. I would suggest that the authors briefly discuss the following point, which is sometimes overlooked when discussing the implicit bias of gradient descent in the context of low-rank matrix recovery. When additionally restricting to positive semidefinite matrices, it turns out that the original low-rank matrix is often the UNIQUE positive semidefinite solution to the linear equation y = A(X); see the paper "Implicit regularization and solution uniqueness in over-parameterized matrix sensing" by Geyer et al., arXiv:1806.02046.


Review for NeurIPS paper: Robust Recovery via Implicit Bias of Discrepant Learning Rates for Double Over-parameterization

Neural Information Processing Systems

The paper was discussed after the rebuttal, which the reviewers found useful and actionable (e.g., clarifications about the novelty of the proof and about the experiments). The paper is recommended for acceptance. All reviewers acknowledged that the paper makes a step towards better understanding over-parameterization and the implicit bias of gradient descent. As promised in the rebuttal, it is important to include the mentioned clarifications and discussions in the final version of the paper.


Robust recovery of multiple subspaces by geometric l_p minimization

Lerman, Gilad, Zhang, Teng

arXiv.org Machine Learning

We assume i.i.d. data sampled from a mixture distribution with K components along fixed d-dimensional linear subspaces and an additional outlier component. For p>0, we study the simultaneous recovery of the K fixed subspaces by minimizing the l_p-averaged distances of the sampled data points from any K subspaces. Under some conditions, we show that if $0<p\le 1$, then all underlying subspaces can be precisely recovered by l_p minimization; on the other hand, if $K>1$ and $p>1$, then the underlying subspaces cannot be recovered or even nearly recovered by l_p minimization. The results of this paper partially explain the successes and failures of the basic approach of l_p energy minimization for modeling data by multiple subspaces.
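For illustration only, here is one simple heuristic for approximately minimizing the l_p-averaged distances studied above: an alternating, K-subspaces-style iteration that assigns each point to its nearest subspace and then refits each subspace by a reweighted PCA with weights dist^(p-2). This is a hedged sketch of a standard heuristic under assumed names and parameters, not the authors' algorithm or the object of their analysis, which concerns the global l_p minimizer.

```python
import numpy as np

def lp_k_subspaces(X, K, d, p=1.0, iters=50, seed=0):
    """Heuristic alternating minimization of sum_i min_k dist(x_i, L_k)^p
    over K d-dimensional subspaces L_1, ..., L_K (IRLS-weighted K-subspaces)."""
    rng = np.random.default_rng(seed)
    n, D = X.shape
    # start from K random d-dimensional orthonormal bases
    bases = [np.linalg.qr(rng.standard_normal((D, d)))[0] for _ in range(K)]
    labels = np.zeros(n, dtype=int)
    for _ in range(iters):
        # assignment step: distance of each point to each subspace
        dists = np.stack(
            [np.linalg.norm(X - X @ B @ B.T, axis=1) for B in bases], axis=1)
        labels = dists.argmin(axis=1)
        # update step: reweighted PCA per cluster; for p < 2 the weights
        # dist^(p-2) downweight far-away (outlying) points
        for k in range(K):
            Xk = X[labels == k]
            if Xk.shape[0] <= d:
                continue
            resid = np.linalg.norm(Xk - Xk @ bases[k] @ bases[k].T, axis=1)
            w = (resid + 1e-8) ** (p - 2)
            C = (Xk * w[:, None]).T @ Xk        # weighted second-moment matrix
            eigvecs = np.linalg.eigh(C)[1]      # eigenvectors, ascending eigenvalues
            bases[k] = eigvecs[:, -d:]          # top-d eigenvectors span the new subspace
    return bases, labels
```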