Precise Dynamics of Diagonal Linear Networks: A Unifying Analysis by Dynamical Mean-Field Theory
Sota Nishiyama, Masaaki Imaizumi
The training dynamics of neural networks have attracted significant attention in deep learning theory. It has been suggested that the dynamics induced by training algorithms strongly influence the generalization performance of neural networks. This effect is captured by the notion of implicit bias (Neyshabur et al., 2014), in which the algorithm selects a particular solution among the many admitted by the nonconvexity of the loss and the overparametrization of the network. Accordingly, many recent works have studied the interplay between models and optimizers, aiming to characterize the resulting implicit biases (Neyshabur, 2017; Soudry et al., 2018; Arora et al., 2019; Bartlett et al., 2021). Moreover, understanding the convergence speed and timescales of the training dynamics contributes to the efficient training of high-performance models in practice, especially for modern large-scale neural networks whose training is stopped at a compute-optimal point (Kaplan et al., 2020).
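As a concrete illustration of the implicit-bias phenomenon mentioned above (and of the diagonal linear network model named in the title), the following minimal NumPy sketch is not part of the paper, whose analysis relies on dynamical mean-field theory; all names, sizes, and hyperparameters here are illustrative assumptions. It trains a diagonal linear network with effective weight w = u * v by plain gradient descent on an underdetermined sparse-regression problem, then compares the L1 norm of the solution it selects with that of the minimum-L2-norm interpolator; with a small initialization scale, the trained network typically lands near a sparse (small-L1) interpolator.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy underdetermined sparse-regression problem (all sizes are illustrative).
n, d, k = 20, 50, 3                      # samples, features, nonzero coefficients
X = rng.standard_normal((n, d)) / np.sqrt(n)
w_star = np.zeros(d)
w_star[rng.choice(d, size=k, replace=False)] = rng.standard_normal(k)
y = X @ w_star

# Diagonal linear network: the effective predictor weight is w = u * v,
# and both factors are trained by gradient descent on the squared loss.
alpha = 1e-2                             # small initialization scale (drives the implicit bias)
u = alpha * np.ones(d)
v = alpha * np.ones(d)
lr, steps = 0.1, 100_000                 # illustrative step size and horizon

for _ in range(steps):
    w = u * v
    grad_w = X.T @ (X @ w - y) / n       # gradient of the mean squared loss w.r.t. w
    grad_u, grad_v = grad_w * v, grad_w * u   # chain rule through w = u * v
    u -= lr * grad_u
    v -= lr * grad_v
w_dln = u * v

# Baseline: the minimum-L2-norm interpolator given by the pseudoinverse.
w_l2 = np.linalg.pinv(X) @ y

print("L1 norm of diagonal-network solution:", np.abs(w_dln).sum())
print("L1 norm of min-L2 interpolator      :", np.abs(w_l2).sum())
print("L1 norm of sparse ground truth      :", np.abs(w_star).sum())
```

Under these assumptions, shrinking the initialization scale `alpha` pushes the gradient-descent solution further toward the minimum-L1 interpolator, which is the kind of algorithm-dependent selection effect the abstract refers to as implicit bias.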
Oct-3-2025