Saddle-to-Saddle Dynamics in Diagonal Linear Networks
Neural Information Processing Systems
In this paper, we fully describe the trajectory of gradient flow over 2-layer diagonal linear networks in the regression setting, in the limit of vanishing initialisation.
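As a rough illustration of the setting, the sketch below runs plain gradient descent with a small step size (a stand-in for gradient flow) on a toy regression problem. It makes several assumptions that do not come from the text above: the parametrisation beta = u * v, the symmetric initialisation u = v = alpha, the identity design matrix, and the helper name train_diagonal_network are all chosen here for illustration only.

```python
import numpy as np


def train_diagonal_network(X, y, alpha=1e-3, lr=1e-2, steps=3000):
    """Gradient descent on 0.5 * ||X (u * v) - y||^2 from a small initialisation.

    Hypothetical helper: the parametrisation beta = u * v and the
    initialisation u = v = alpha are assumptions made for this sketch.
    """
    d = X.shape[1]
    u = np.full(d, alpha)
    v = np.full(d, alpha)
    history = np.empty((steps + 1, d))
    history[0] = u * v
    for t in range(steps):
        residual = X @ (u * v) - y
        grad_beta = X.T @ residual  # gradient of the loss with respect to beta
        # Chain rule through beta = u * v, with a simultaneous update of u and v.
        u, v = u - lr * grad_beta * v, v - lr * grad_beta * u
        history[t + 1] = u * v
    return history


# Toy problem (assumed): identity design, so coordinates evolve independently.
X = np.eye(3)
y = np.array([3.0, 1.0, 0.0])
history = train_diagonal_network(X, y)

# First step at which each non-zero coordinate passes half of its target value.
escape_steps = [int(np.argmax(history[:, k] > 0.5 * y[k])) for k in (0, 1)]
print("final beta:", history[-1])
print("escape steps:", escape_steps)
```

In this toy run each coordinate stays close to zero for a while and then moves quickly to its target, with the larger target escaping first. This is only meant to convey the flavour of a piecewise trajectory under small initialisation; it does not reproduce the paper's analysis.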
May-28-2025, 12:48:33 GMT