A Minimalist Example of Edge-of-Stability and Progressive Sharpening

Neural Information Processing Systems 

Recent advances in deep learning optimization have revealed two intriguing phenomena that arise under large learning rates, Edge of Stability (EoS) and Progressive Sharpening (PS), both of which challenge classical Gradient Descent (GD) analyses.