Dissecting the Effects of SGD Noise in Distinct Regimes of Deep Learning
