Wide-minima Density Hypothesis and the Explore-Exploit Learning Rate Schedule
