Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent
Zhiyuan Li
Neural Information Processing Systems
As part of the effort to understand the implicit bias of gradient descent in overparametrized models, several results have shown that the training trajectory of the overparametrized model can be understood as mirror descent on a different objective. The main result here is a characterization of this phenomenon under a notion termed commuting parametrization, which encompasses all previous results in this setting.
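To make the gradient descent versus mirror descent correspondence concrete, here is a minimal numerical sketch of one well-known special case. It is an illustration chosen for this note, not an example taken from the abstract: the elementwise-square parametrization w = u * u, the toy quadratic loss, the `TARGET` values, and all function names are assumptions. For this parametrization, gradient flow on u gives dw/dt = -4 w * grad L(w), which matches mirror flow on w with the entropy-like potential R(w) = 0.25 * sum(w log w - w). The discrete updates below should therefore agree up to terms of order of the step size.

```python
import math

# Hypothetical toy objective L(w) = 0.5 * sum((w_i - TARGET_i)^2); illustrative only.
TARGET = [1.0, 0.25]


def grad_loss(w):
    """Gradient of the toy loss with respect to w."""
    return [wi - ti for wi, ti in zip(w, TARGET)]


def gd_on_reparam(u0, lr, steps):
    """Plain gradient descent on u for the loss L(u * u); returns w = u * u."""
    u = list(u0)
    for _ in range(steps):
        g = grad_loss([ui * ui for ui in u])
        # Chain rule: d/du_i L(u * u) = 2 * u_i * dL/dw_i
        u = [ui - lr * 2.0 * ui * gi for ui, gi in zip(u, g)]
    return [ui * ui for ui in u]


def mirror_descent(w0, lr, steps):
    """Mirror descent on w with potential R(w) = 0.25 * sum(w log w - w)."""
    w = list(w0)
    for _ in range(steps):
        g = grad_loss(w)
        # grad R(w) = 0.25 * log(w), so the dual-space step is
        # log(w) <- log(w) - 4 * lr * g, i.e. a multiplicative update on w.
        w = [wi * math.exp(-4.0 * lr * gi) for wi, gi in zip(w, g)]
    return w


if __name__ == "__main__":
    u0 = [0.5, 1.0]                # initial parameters in u-space
    w0 = [ui * ui for ui in u0]    # matching initial point in w-space
    print(gd_on_reparam(u0, lr=1e-3, steps=200))
    print(mirror_descent(w0, lr=1e-3, steps=200))
```

The two printed trajectories are compared at an intermediate time rather than at convergence, since both methods eventually reach the same minimizer of this toy loss and agreement there would say little about the trajectories themselves.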
Aug-19-2025, 12:33:22 GMT