Parameter Symmetry and Noise Equilibrium of Stochastic Gradient Descent
Mingze Wang (Massachusetts Institute of Technology; Peking University; NTT Research)

Neural Information Processing Systems 

Symmetries are prevalent in deep learning and can significantly influence the learning dynamics of neural networks. In this paper, we examine how exponential symmetries - a broad subclass of continuous symmetries present in the model architecture or loss function - interplay with stochastic gradient descent (SGD). We first prove that gradient noise creates a systematic motion (a "Noether flow") of the parameters θ along the degenerate direction to a unique, initialization-independent fixed point θ*. These points are referred to as noise equilibria because, at these points, noise contributions from different directions are balanced and aligned.
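For readers unfamiliar with the term, an exponential symmetry is commonly formalized as invariance of the loss under a one-parameter exponential group action. The following display is a sketch consistent with the abstract (the generator A and the scalar λ are notation introduced here for illustration, not taken verbatim from the paper):

\[
  L\big(e^{\lambda A}\,\theta\big) = L(\theta) \qquad \text{for all } \lambda \in \mathbb{R},
\]

where A is a fixed matrix generating the symmetry. Differentiating at λ = 0 shows that Aθ is a degenerate direction of the loss, ⟨∇L(θ), Aθ⟩ = 0, so plain gradient descent produces no motion along it; the abstract's claim is that the noise of SGD does, driving θ along this direction to the noise equilibrium θ*. A familiar special case is the rescaling symmetry of homogeneous two-layer networks, L(e^{λ}u, e^{-λ}w) = L(u, w), which corresponds to A = diag(I, -I).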
