Understanding Gradient Descent on Edge of Stability in Deep Learning

Open in new window