Abide by the law and follow the flow: conservation laws for gradient flows

Neural Information Processing Systems 

Understanding the geometric properties of gradient descent dynamics is a key ingredient in deciphering the recent success of very large machine learning models. A striking observation is that trained over-parameterized models retain some properties of their initialization. This "implicit bias" is believed to be responsible for some favorable properties of the trained models and could explain why they generalize well. The purpose of this article is threefold. First, we rigorously expose the definition and basic properties of "conservation laws", which are quantities conserved during the gradient flow of a given model (e.g., a ReLU network with a given architecture), for any training data and any loss.
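To make the notion concrete, a classical instance of such a conservation law arises for a two-layer linear model $f(x) = v\,u\,x$: along the gradient flow of any loss, the "balancedness" $u^2 - v^2$ stays constant, since $\frac{d}{dt}(u^2 - v^2) = -2u\,\partial_u L + 2v\,\partial_v L = 0$ by the symmetry of the two gradients. Below is a minimal numerical sketch of this fact (the toy data, initialization, and step size are illustrative choices, not from the paper); small-step gradient descent approximates the continuous flow, so the conserved quantity should match its initial value up to a discretization error of order the step size.

```python
import numpy as np

# Illustrative toy setup (hypothetical, not from the paper): a scalar
# two-layer linear model f(x) = v * u * x trained with squared loss
# by small-step gradient descent, which approximates the gradient flow.
rng = np.random.default_rng(0)
x = rng.normal(size=20)
y = rng.normal(size=20)
u, v = 1.5, -0.3   # parameter initialization
lr = 1e-3          # small step to stay close to the continuous-time flow

balancedness_init = u**2 - v**2

for _ in range(5000):
    r = v * u * x - y          # residuals of the model on the data
    gu = np.mean(r * v * x)    # dL/du for the mean squared loss
    gv = np.mean(r * u * x)    # dL/dv
    u, v = u - lr * gu, v - lr * gv

# u**2 - v**2 is exactly conserved by the flow, so after training it
# should agree with its initial value up to O(lr) discretization error.
print(u**2 - v**2, balancedness_init)
```

The key point the sketch illustrates is that this quantity is conserved regardless of the data `x, y` and of the loss: it depends only on the model's parameterization, which is precisely the sense in which the article studies conservation laws.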