Training on the Edge of Stability Is Caused by Layerwise Jacobian Alignment
