Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late in Training
