Noise Injection as a Probe of Deep Learning Dynamics
Levi, Noam, Bloch, Itay, Freytsis, Marat, Volansky, Tomer
arXiv.org Artificial Intelligence
Deep learning has proven exceedingly successful, leading to dramatic improvements across multiple domains. Nevertheless, our theoretical understanding of deep learning methods remains unsatisfactory. In particular, the training of deep neural networks (DNNs) is a highly opaque procedure, with few metrics, beyond curvature evolution [1-7], available to describe how a network evolves as it trains. An interesting attempt at parameterizing the interplay between training dynamics and generalization was explored in the seminal work of Ref. [8], which demonstrated that when input data were corrupted by adding random noise, the generalization error deteriorated in proportion to the noise strength. Noise injection has gained further traction in recent years, both as a means of effective regularization [9-18] and as a route towards understanding DNN dynamics and generalization. For instance, label noise has been shown to affect the implicit bias of Stochastic Gradient Descent (SGD) [19-23], as in certain cases sparse solutions appear to be preferred over those which reduce the Euclidean norm. In this work, we take another step in this direction by allowing the network to actively regulate the effects of the injected noise during training. Concretely, we define Noise Injection Nodes (NINs), whose output is a random variable drawn sample-wise from a given distribution.
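The abstract's definition of a NIN can be sketched in a few lines. The following is a minimal illustration, not the paper's implementation: the class name, the trainable scale `weight`, and the supported distributions are all hypothetical choices made here for concreteness; the only property taken from the text is that the node's output is a fresh random variable drawn independently for each sample from a fixed distribution, with a scale the network could learn to regulate during training.

```python
import numpy as np

class NoiseInjectionNode:
    """Illustrative sketch of a Noise Injection Node (NIN).

    Its output is a random variable drawn sample-wise from a given
    distribution. The trainable scale `weight` is a hypothetical detail
    standing in for whatever mechanism lets the network regulate the
    injected noise during training.
    """

    def __init__(self, dist="gaussian", seed=0):
        self.rng = np.random.default_rng(seed)
        self.dist = dist
        self.weight = 1.0  # hypothetical trainable scale

    def forward(self, batch_size):
        # One independent draw per sample in the batch.
        if self.dist == "gaussian":
            noise = self.rng.standard_normal(batch_size)
        elif self.dist == "uniform":
            noise = self.rng.uniform(-1.0, 1.0, size=batch_size)
        else:
            raise ValueError(f"unknown distribution: {self.dist}")
        return self.weight * noise

nin = NoiseInjectionNode(dist="gaussian")
out = nin.forward(8)  # eight independent noise draws, one per sample
```

In a real network this node's output would be concatenated with (or added to) ordinary activations, so that gradient descent on `weight` determines how strongly the noise influences the loss.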
Oct-24-2022