The alignment property of SGD noise and how it helps select flat minima: A stability analysis
