Simulated Annealing in Early Layers Leads to Better Generalization
