Robust Recovery via Implicit Bias of Discrepant Learning Rates for Double Over-parameterization

Neural Information Processing Systems

Recent advances have shown that the implicit bias of gradient descent on over-parameterized models enables the recovery of low-rank matrices from linear measurements, even with no prior knowledge of the intrinsic rank. In contrast, for {\em robust} low-rank matrix recovery from {\em grossly corrupted} measurements, over-parameterization leads to overfitting without prior knowledge of both the intrinsic rank and the sparsity of the corruption. This paper shows that with a {\em double over-parameterization} for both the low-rank matrix and the sparse corruption, gradient descent with {\em discrepant learning rates} provably recovers the underlying matrix even without prior knowledge of either the rank of the matrix or the sparsity of the corruption. We further extend our approach to the robust recovery of natural images by over-parameterizing images with deep convolutional networks. Experiments show that our method handles different test images and varying corruption levels with a single learning pipeline, where the network width and termination conditions do not need to be adjusted on a case-by-case basis. Underlying this success is again the implicit bias of gradient descent with discrepant learning rates on different over-parameterized parameters, which may bear on broader applications.
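
To make the idea in the abstract concrete, here is a minimal NumPy sketch of double over-parameterization with discrepant learning rates, assuming the PSD case where the low-rank matrix is parameterized as U Uᵀ and the sparse corruption as the difference of Hadamard squares g∘g − h∘h. All dimensions, step sizes, and the ratio `alpha` below are illustrative choices, not the authors' settings; this is a sketch of the technique, not their reference implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Problem setup: y = A(X*) + s*, with X* low-rank PSD and s* sparse.
n, r, m = 30, 2, 600
U_star = rng.normal(size=(n, r)) / np.sqrt(n)
X_star = U_star @ U_star.T                      # ground-truth low-rank matrix

A = rng.normal(size=(m, n * n)) / np.sqrt(m)    # Gaussian measurement operator
s_star = np.zeros(m)
outliers = rng.choice(m, size=m // 10, replace=False)
s_star[outliers] = rng.normal(scale=1.0, size=outliers.size)
y = A @ X_star.ravel() + s_star                 # grossly corrupted measurements

# Double over-parameterization: X = U U^T with width k >= r (rank unknown),
# and s = g*g - h*h for the sparse corruption (sparsity unknown).
k = n
U = 1e-3 * rng.normal(size=(n, k))
g = 1e-3 * np.ones(m)
h = 1e-3 * np.ones(m)

eta = 0.02     # learning rate for U
alpha = 1.0    # discrepancy ratio: (g, h) use learning rate alpha * eta

for _ in range(10000):
    resid = A @ (U @ U.T).ravel() + g * g - h * h - y
    grad_X = (A.T @ resid).reshape(n, n)        # dL/dX for L = 0.5 * ||resid||^2
    U -= eta * (grad_X + grad_X.T) @ U          # chain rule through X = U U^T
    g -= alpha * eta * (2 * resid * g)          # chain rule through s = g*g - h*h
    h -= alpha * eta * (-2 * resid * h)

rel_err = np.linalg.norm(U @ U.T - X_star) / np.linalg.norm(X_star)
print(f"relative recovery error: {rel_err:.3e}")
```

In this reading, the ratio `alpha` between the two learning rates plays the role of a regularization trade-off between the low-rank and sparse components, which is how the discrepant rates can substitute for explicit knowledge of rank and sparsity.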


Review for NeurIPS paper: Robust Recovery via Implicit Bias of Discrepant Learning Rates for Double Over-parameterization

Neural Information Processing Systems

Additional Feedback: To be honest, I find the term "double over-parameterization" a bit strange; I would still call it simply "over-parameterization". Perhaps the authors could think about this point and potentially adjust. I would also suggest that the authors briefly discuss the following point, which is sometimes overlooked when discussing the implicit bias of gradient descent in the context of low-rank matrix recovery. When additionally restricting to positive semidefinite matrices, it turns out that the original low-rank matrix is often the UNIQUE positive semidefinite solution to the linear equation y = A(X); see the paper "Implicit regularization and solution uniqueness in over-parameterized matrix sensing" by Geyer et al., arXiv:1806.02046.
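
For concreteness, the uniqueness property the reviewer refers to can be written as follows (our paraphrase; the precise conditions on the measurement operator $\mathcal{A}$ are in the cited paper):

```latex
% Under suitable conditions on \mathcal{A}, the PSD-constrained feasible set
% is a singleton containing only the ground-truth low-rank matrix X^\star:
\[
  \{\, X \succeq 0 \;:\; \mathcal{A}(X) = y \,\} \;=\; \{ X^\star \},
\]
% so any method that returns a PSD solution of the equations recovers X^\star.
```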


Review for NeurIPS paper: Robust Recovery via Implicit Bias of Discrepant Learning Rates for Double Over-parameterization

Neural Information Processing Systems

The paper was discussed after the rebuttal, which the reviewers found useful and actionable (e.g., it clarified the novelty of the proof and the experiments). The paper is recommended for acceptance. All reviewers acknowledged that the paper makes a step toward better understanding of over-parameterization and the implicit bias of gradient descent. As promised in the rebuttal, it is important that the final version of the paper include the mentioned clarifications and discussions.

