Benign Oscillation of Stochastic Gradient Descent with Large Learning Rates
