Chaotic Regularization and Heavy-Tailed Limits for Deterministic Gradient Descent

Dec-24-2025, 23:11:49 GMT–Neural Information Processing Systems

Recent studies have shown that gradient descent (GD) can achieve improved generalization when its dynamics exhibit chaotic behavior. However, to obtain the desired effect, the step-size must be chosen sufficiently large, a choice that is problem-dependent and can be difficult in practice. In this study, we incorporate a chaotic component into GD in a controlled manner, and introduce multiscale perturbed GD (MPGD), a novel optimization framework in which the GD recursion is augmented with chaotic perturbations that evolve via an independent dynamical system. We analyze MPGD from three different angles: (i) Building on recent advances in rough paths theory, we show that, under appropriate assumptions, as the step-size decreases, the MPGD recursion converges weakly to a stochastic differential equation (SDE) driven by a heavy-tailed Lévy-stable process.
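
To make the idea of augmenting the GD recursion with a perturbation from an independent deterministic system more concrete, here is a minimal sketch in Python. It is not the MPGD recursion analyzed in the paper: the chaotic map (a logistic map with parameter 4), the centering of the perturbation, the way it is scaled by the step-size, and all names (`logistic_map`, `perturbed_gd`, `noise_scale`) are assumptions chosen purely for illustration.

```python
import numpy as np


def logistic_map(y):
    """One step of the logistic map with r = 4 (an illustrative chaotic map,
    assumed here; the abstract does not specify the map)."""
    return 4.0 * y * (1.0 - y)


def perturbed_gd(grad, x0, y0, step_size=0.01, noise_scale=0.1, n_steps=1000):
    """Gradient descent with an additive, fully deterministic perturbation.

    The auxiliary state y evolves on its own (it never sees x), and a
    centered copy of it is added to each GD update. The scaling
    step_size * noise_scale is a hypothetical choice for this sketch.
    """
    x = np.asarray(x0, dtype=float).copy()
    y = np.asarray(y0, dtype=float).copy()
    for _ in range(n_steps):
        # Subtract 0.5, the mean of the r = 4 logistic map's invariant
        # density on [0, 1], so the perturbation is roughly centered.
        perturbation = y - 0.5
        x = x - step_size * grad(x) + step_size * noise_scale * perturbation
        y = logistic_map(y)  # independent chaotic dynamics
    return x, y


# Usage: minimize f(x) = 0.5 * ||x||^2, whose gradient is x.
x_final, y_final = perturbed_gd(
    grad=lambda x: x,
    x0=np.array([1.0, -1.0]),
    y0=np.array([0.123, 0.456]),
)
```

Because no random numbers are drawn, repeated runs from the same `x0` and `y0` produce identical iterates; the irregularity comes only from the chaotic orbit of `y`. A bounded map like this one cannot be expected to produce the heavy-tailed limit described in the abstract, which depends on the paper's specific choice of dynamics and scaling.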

chaotic regularization and heavy-tailed limit, deterministic gradient descent, name change, (3 more...)

Genre:
- Research Report (0.59)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (0.42)