
Collaborating Authors: Volansky, Tomer


Noise Injection Node Regularization for Robust Learning

arXiv.org Artificial Intelligence

We introduce Noise Injection Node Regularization (NINR), a method of injecting structured noise into Deep Neural Networks (DNNs) during the training stage, resulting in an emergent regularizing effect. We present theoretical and empirical evidence of substantially improved robustness against various test-data perturbations for feed-forward DNNs trained under NINR. The novelty of our approach lies in the interplay between adaptive noise injection and initialization conditions, chosen so that noise is the dominant driver of the dynamics at the start of training. Since the method simply requires adding external nodes, without altering the existing network structure or optimization algorithm, it can be easily incorporated into many standard problem specifications. We find improved stability against a number of data perturbations, including domain shifts, with the most dramatic improvement obtained for unstructured noise, where our technique in some cases outperforms existing methods such as Dropout or $L_2$ regularization. We further show that desirable generalization properties on clean data are generally maintained.
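The mechanism the abstract describes — appending external noise nodes to the input so the rest of the network is untouched — can be sketched minimally as follows. The function name, layer sizes, and the choice of a Gaussian distribution are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def append_noise_nodes(x, n_noise=1, scale=1.0, rng=None):
    """Concatenate noise-injection nodes to each input sample.

    x: (batch, features) array of clean inputs.
    Each noise node's value is drawn sample-wise; the Gaussian
    distribution and the sizes here are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    noise = scale * rng.standard_normal((x.shape[0], n_noise))
    return np.concatenate([x, noise], axis=1)

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 3))               # batch of 4 samples, 3 features
x_aug = append_noise_nodes(x, n_noise=2, scale=0.5, rng=rng)

# An ordinary dense layer simply consumes the widened input;
# no change to the network structure or optimizer is needed.
W = rng.standard_normal((x_aug.shape[1], 8))
h = np.tanh(x_aug @ W)
```

At test time the noise nodes can be fed zeros (or fresh draws), leaving the trained weights and architecture unchanged.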


Noise Injection as a Probe of Deep Learning Dynamics

arXiv.org Artificial Intelligence

Deep learning has proven exceedingly successful, leading to dramatic improvements across multiple domains. Nevertheless, our theoretical understanding of deep learning methods remains unsatisfactory. In particular, the training of DNNs is a highly opaque procedure, with few metrics beyond curvature evolution [1-7] available to describe how a network evolves as it trains. An interesting attempt at parameterizing the interplay between training dynamics and generalization was the seminal work of Ref. [8], which demonstrated that when input data were corrupted by additive random noise, the generalization error deteriorated in correlation with the noise strength. Noise injection has gained further traction in recent years, both as a means of effective regularization [9-18] and as a route toward understanding DNN dynamics and generalization. For instance, label noise has been shown to affect the implicit bias of Stochastic Gradient Descent (SGD) [19-23]: in certain cases, sparse solutions appear to be preferred over those that reduce the Euclidean norm. In this work, we take another step in this direction by allowing the network to actively regulate the effects of the injected noise during training. Concretely, we define Noise Injection Nodes (NINs), whose output is a random variable drawn sample-wise from a given distribution.
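The idea that the network "actively regulates" an injected noise source can be illustrated with a toy linear model: attach one NIN whose value is resampled per example, initialize its weight to dominate, and watch gradient descent suppress it while the signal weights are learned. All hyperparameters and variable names below are assumptions made for the sketch, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 256, 3
X = rng.standard_normal((n, d))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true                              # clean targets, no label noise

# One Noise Injection Node: an extra input drawn sample-wise each epoch.
w = np.zeros(d + 1)
w[-1] = 1.0                                 # noise weight starts dominant
lr = 0.05
for epoch in range(200):
    nin = rng.standard_normal((n, 1))       # fresh sample-wise draws
    Xa = np.concatenate([X, nin], axis=1)   # widened input
    grad = 2 * Xa.T @ (Xa @ w - y) / n      # MSE gradient
    w -= lr * grad
```

Because the NIN output is independent of the target, its gradient contracts the attached weight toward zero while the remaining weights converge to the signal; the decay of that weight is one simple probe of the training dynamics.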