Drop-Activation: Implicit Parameter Reduction and Harmonious Regularization
Liang, Senwei, Khoo, Yuehaw, Yang, Haizhao
Overfitting frequently occurs in deep learning. In this paper, we propose a novel regularization method called Drop-Activation to reduce overfitting and improve generalization. The key idea is to \emph{drop} nonlinear activation functions by setting them to identity functions at random during training. During testing, we use a deterministic network with a new activation function that encodes the average effect of dropping activations randomly. Experimental results on CIFAR-10, CIFAR-100, SVHN, and EMNIST show that Drop-Activation generally improves the performance of popular neural network architectures. Furthermore, unlike dropout, Drop-Activation as a regularizer can be used in harmony with standard training and regularization techniques such as Batch Normalization and AutoAugment. Our theoretical analyses support the regularization effect of Drop-Activation as implicit parameter reduction and its capability to be used together with Batch Normalization.
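The training/testing rule described in the abstract lends itself to a short sketch. The following PyTorch-style illustration is an assumption-laden sketch, not the authors' reference implementation: the module name `DropActivation`, the element-wise masking, and the keep-probability `p = 0.95` are choices made here for illustration. During training each element keeps its ReLU nonlinearity with probability `p` and falls back to the identity otherwise; at test time the deterministic mixture `p * relu(x) + (1 - p) * x` encodes the average effect of the random dropping.

```python
import torch
import torch.nn as nn


class DropActivation(nn.Module):
    """Illustrative sketch of Drop-Activation (assumed interface, not the paper's code).

    Training: each element keeps its ReLU with probability p; otherwise the
    activation is "dropped" to the identity. Testing: the deterministic
    activation p * relu(x) + (1 - p) * x replaces the random mask by its mean.
    """

    def __init__(self, p: float = 0.95):
        super().__init__()
        self.p = p  # probability of keeping the nonlinearity (assumed value)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.training:
            # Element-wise Bernoulli mask: 1 -> apply ReLU, 0 -> identity.
            mask = torch.bernoulli(torch.full_like(x, self.p))
            return mask * torch.relu(x) + (1.0 - mask) * x
        # Test time: use the expectation of the random activation.
        return self.p * torch.relu(x) + (1.0 - self.p) * x
```

In such a sketch, the module would simply replace `nn.ReLU()` wherever it appears in a network, which is consistent with the claim that Drop-Activation can coexist with standard components such as Batch Normalization and augmentation policies like AutoAugment.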
Nov-14-2018