Improving Deep Learning Optimization through Constrained Parameter Regularization

Mar-18-2026, 05:20:43 GMT–Neural Information Processing Systems

Regularization is a critical component in deep learning. The most commonly used approach, weight decay, applies a constant penalty coefficient uniformly across all parameters. Such a uniform penalty may be overly restrictive for some parameters and insufficient for others. To address this, we present Constrained Parameter Regularization (CPR) as an alternative to traditional weight decay. Instead of applying a single penalty uniformly, CPR enforces an upper bound on a statistical measure, such as the $L_2$-norm, of individual parameter matrices. Consequently, learning becomes a constrained optimization problem, which we tackle using an adaptation of the augmented Lagrangian method.
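
To make the idea concrete, the following is a minimal sketch of a per-matrix constrained update in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names (`measure`, `cpr_step`), the hyperparameters (`kappa` for the bound, `mu` for the multiplier step size, `lr` for the learning rate), the choice of half the squared $L_2$-norm as the bounded measure, and the use of plain gradient descent are all hypothetical choices made here for simplicity. The multiplier rule is the standard clipped update for inequality constraints and stands in for whatever adaptation of the augmented Lagrangian method the paper actually uses.

```python
def measure(matrix):
    """Half the squared L2-norm of a parameter matrix (assumed measure)."""
    return 0.5 * sum(w * w for row in matrix for w in row)


def cpr_step(matrix, loss_grad, lam, kappa, lr=0.1, mu=0.1):
    """One gradient step on a single parameter matrix with its own multiplier.

    matrix, loss_grad: lists of rows with identical shapes.
    lam: current Lagrange multiplier for this matrix (non-negative).
    kappa: upper bound on measure(matrix) (hypothetical name).
    Returns the updated matrix and the updated multiplier.
    """
    # How far the matrix currently exceeds its bound (negative if inside).
    violation = measure(matrix) - kappa
    # Clipped multiplier update: grows while the bound is violated,
    # shrinks back toward zero once the constraint is satisfied.
    lam = max(0.0, lam + mu * violation)
    # The gradient of the assumed measure with respect to w is w itself,
    # so the penalty term acts like a weight decay whose strength is lam.
    new_matrix = [
        [w - lr * (g + lam * w) for w, g in zip(row, grad_row)]
        for row, grad_row in zip(matrix, loss_grad)
    ]
    return new_matrix, lam
```

In this sketch each parameter matrix carries its own multiplier, so a matrix that already satisfies its bound receives no penalty (its multiplier stays at zero), while a matrix that exceeds the bound is shrunk with a strength that adapts over time. This contrasts with weight decay, where every parameter is penalized with the same fixed coefficient.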

artificial intelligence, machine learning, proceedings

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (0.60)
  - Machine Learning > Neural Networks
    - Deep Learning (0.31)