Neuron with Steady Response Leads to Better Generalization

Neural Information Processing Systems 

Because the deep learning models for the classification task always have a normalization operation (e.g., Softmax) to make the final unconstrained

These authors contributed equally to the work.
Work performed during the internship at MSRA.
36th Conference on Neural Information Processing Systems (NeurIPS 2022).

Complexity Measure C will be a positive number in those local minima.

Measure C will be 0. The Lemma is proven.

C.1 Open Source Code

We publish our code on GitHub (i.e., https://github.