Saddle-to-Saddle Dynamics in Diagonal Linear Networks
Neural Information Processing Systems
In this paper we fully describe the trajectory of gradient flow over 2-layer diagonal linear networks for the regression setting in the limit of vanishing initialisation.
Mar-19-2025, 03:14:34 GMT