Entropic gradient descent algorithms and wide flat minima
