AITopics | implicit regularization

Implicit Regularization of Decentralized Gradient Descent for Decentralized Sparse Regression

Neural Information Processing SystemsMay-28-2025, 16:52:44 GMT

We consider learning a sparse model from linear measurements taken by a network of agents.

artificial intelligence, inequality, machine learning, (19 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Industry: Information Technology > Security & Privacy (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Communications > Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.64)

Add feedback

Implicit Regularization in Matrix Factorization

Neural Information Processing SystemsMay-26-2025, 13:53:00 GMT

We study implicit regularization when optimizing an underdetermined quadratic objective over a matrix X with gradient descent on a factorization of X. We conjecture and provide empirical and theoretical evidence that with small enough step sizes and initialization close enough to the origin, gradient descent on a full dimensional factorization converges to the minimum nuclear norm solution.

artificial intelligence, implicit regularization, machine learning, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.91)

Add feedback

7db2348b5bfeca620aa7327df815adcc-Paper-Conference.pdf

Neural Information Processing SystemsMay-25-2025, 02:42:14 GMT

artificial intelligence, machine learning, tensor, (15 more...)

Neural Information Processing Systems

Country: Europe (0.14)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Add feedback

Implicit Regularization in Over-Parameterized Support Vector Machine Xin He

Neural Information Processing SystemsMay-24-2025, 05:41:20 GMT

In this paper, we design a regularization-free algorithm for high-dimensional support vector machines (SVMs) by integrating over-parameterization with Nesterov's smoothing method, and provide theoretical guarantees for the induced implicit regularization phenomenon. In particular, we construct an over-parameterized hinge loss function and estimate the true parameters by leveraging regularization-free gradient descent on this loss function. The utilization of Nesterov's method enhances the computational efficiency of our algorithm, especially in terms of determining the stopping criterion and reducing computational complexity. With appropriate choices of initialization, step size, and smoothness parameter, we demonstrate that unregularized gradient descent achieves a near-oracle statistical convergence rate. Additionally, we verify our theoretical findings through a variety of numerical experiments and compare the proposed method with explicit regularization. Our results illustrate the advantages of employing implicit regularization via gradient descent in conjunction with over-parameterization in sparse SVMs.

artificial intelligence, machine learning, regularization, (17 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.87)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.91)

Add feedback

Implicit Regularization in Deep Learning May Not Be Explainable by Norms

Neural Information Processing SystemsMay-23-2025, 02:30:24 GMT

Mathematically characterizing the implicit regularization induced by gradientbased optimization is a longstanding pursuit in the theory of deep learning. A widespread hope is that a characterization based on minimization of norms may apply, and a standard test-bed for studying this prospect is matrix factorization (matrix completion via linear neural networks). It is an open question whether norms can explain the implicit regularization in matrix factorization. The current paper resolves this open question in the negative, by proving that there exist natural matrix factorization problems on which the implicit regularization drives all norms (and quasi-norms) towards infinity. Our results suggest that, rather than perceiving the implicit regularization via norms, a potentially more useful interpretation is minimization of rank. We demonstrate empirically that this interpretation extends to a certain class of non-linear neural networks, and hypothesize that it may be key to explaining generalization in deep learning.

artificial intelligence, deep learning, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > Canada (0.14)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Implicit Regularization in Deep Learning May Not Be Explainable by Norms

Neural Information Processing SystemsMay-23-2025, 02:30:16 GMT

Mathematically characterizing the implicit regularization induced by gradientbased optimization is a longstanding pursuit in the theory of deep learning. A widespread hope is that a characterization based on minimization of norms may apply, and a standard test-bed for studying this prospect is matrix factorization (matrix completion via linear neural networks). It is an open question whether norms can explain the implicit regularization in matrix factorization. The current paper resolves this open question in the negative, by proving that there exist natural matrix factorization problems on which the implicit regularization drives all norms (and quasi-norms) towards infinity. Our results suggest that, rather than perceiving the implicit regularization via norms, a potentially more useful interpretation is minimization of rank. We demonstrate empirically that this interpretation extends to a certain class of non-linear neural networks, and hypothesize that it may be key to explaining generalization in deep learning.

artificial intelligence, deep learning, machine learning, (14 more...)

Neural Information Processing Systems

Country: North America > Canada (0.14)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

f21e255f89e0f258accbe4e984eef486-AuthorFeedback.pdf

Neural Information Processing SystemsMay-23-2025, 02:30:03 GMT

We thank reviewers for their time and effort! Miscellaneous () Thank you for the positive feedback! Miscellaneous () Thank you for the feedback and support! By this they refute the prospect of norms being implicitly minimized on every convex objective. To our knowledge, very few have endorsed this far-reaching prospect.

artificial intelligence, machine learning, matrix factorization, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

ab3f6bbe121a8f7a0263a9b393000741-Paper-Conference.pdf

Neural Information Processing SystemsMar-27-2025, 11:31:42 GMT

artificial intelligence, machine learning, optimization problem, (14 more...)

Neural Information Processing Systems

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Implicit Regularization of Discrete Gradient Dynamics in Linear Neural Networks

Neural Information Processing SystemsMar-27-2025, 04:12:35 GMT

When optimizing over-parameterized models, such as deep neural networks, a large set of parameters can achieve zero training error. In such cases, the choice of the optimization algorithm and its respective hyper-parameters introduces biases that will lead to convergence to specific minimizers of the objective. Consequently, this choice can be considered as an implicit regularization for the training of over-parametrized models. In this work, we push this idea further by studying the discrete gradient dynamics of the training of a two-layer linear network with the least-squares loss. Using a time rescaling, we show that, with a vanishing initialization and a small enough step size, this dynamics sequentially learns the solutions of a reduced-rank regression with a gradually increasing rank.

artificial intelligence, assumption, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > Canada (0.46)

Technology: