Heavy Tails in SGD and Compressibility of Overparametrized Neural Networks
