GradAug: A New Regularization Method for Deep Neural Networks

Neural Information Processing Systems

We propose a new regularization method to alleviate over-fitting in deep neural networks. The key idea is to use randomly transformed training samples to regularize a set of sub-networks, obtained by sampling the width of the original network, during training. The method thereby introduces self-guided disturbances to the raw gradients of the network, hence the name Gradient Augmentation (GradAug). We demonstrate that GradAug helps the network learn well-generalized and more diverse representations. Moreover, it is easy to implement and can be applied to various structures and applications. GradAug improves ResNet-50 to 78.79% on ImageNet classification, which is a new state-of-the-art accuracy. Combined with CutMix, it further boosts the performance to 79.67%, outperforming an ensemble of advanced training tricks. The generalization ability is evaluated on COCO object detection and instance segmentation, where GradAug significantly surpasses other state-of-the-art methods. GradAug is also robust to image distortions and FGSM adversarial attacks, and is highly effective in low-data regimes.
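The training step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Several details are assumptions made for illustration: the toy two-layer MLP, taking the first `k` hidden units as a width-sampled sub-network, a random input rescaling as the "random transformation", and training the sub-networks against the full network's predictions as soft labels (the abstract does not specify the sub-network loss). All function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_params(n_in, n_hidden, n_out):
    # Toy two-layer MLP: [W1, b1, W2, b2].
    return [0.3 * rng.standard_normal((n_in, n_hidden)), np.zeros(n_hidden),
            0.3 * rng.standard_normal((n_hidden, n_out)), np.zeros(n_out)]

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def forward(params, x, k):
    # Use only the first k hidden units: a width-sampled sub-network
    # that shares weights with the full network.
    W1, b1, W2, b2 = params
    h = np.maximum(x @ W1[:, :k] + b1[:k], 0.0)
    return h, softmax(h @ W2[:k] + b2)

def backward(params, x, h, probs, target, k):
    # Gradient of the cross-entropy between `target` and `probs` for the
    # k-wide sub-network, zero-padded to the full parameter shapes.
    W2 = params[2]
    dlogits = (probs - target) / x.shape[0]
    gW1, gb1, gW2, gb2 = (np.zeros_like(p) for p in params)
    gW2[:k] = h.T @ dlogits
    gb2[:] = dlogits.sum(axis=0)
    dpre = (dlogits @ W2[:k].T) * (h > 0)
    gW1[:, :k] = x.T @ dpre
    gb1[:k] = dpre.sum(axis=0)
    return [gW1, gb1, gW2, gb2]

def gradaug_step(params, x, y_onehot, transform, n_sub=2, min_width=0.5, lr=0.3):
    hidden = params[0].shape[1]
    # Full network: untransformed batch, ground-truth labels.
    h, p_full = forward(params, x, hidden)
    grads = backward(params, x, h, p_full, y_onehot, hidden)
    for _ in range(n_sub):
        # Sample a width and a random transformation for each sub-network.
        k = max(1, int(round(rng.uniform(min_width, 1.0) * hidden)))
        x_t = transform(x)
        h_s, p_s = forward(params, x_t, k)
        # Assumption: sub-networks fit the full network's fixed predictions.
        g_sub = backward(params, x_t, h_s, p_s, p_full, k)
        # The sub-network gradients perturb ("augment") the raw gradient.
        grads = [g + gs for g, gs in zip(grads, g_sub)]
    for p, g in zip(params, grads):
        p -= lr * g  # in-place SGD update on the shared weights
    # Report the full-network loss measured before the update.
    return -np.mean(np.sum(y_onehot * np.log(p_full + 1e-12), axis=1))

def train_toy(steps=300):
    # Hypothetical toy problem: classify the sign of x0 + x1.
    x = rng.standard_normal((256, 4))
    y = np.eye(2)[(x[:, 0] + x[:, 1] > 0).astype(int)]
    params = init_params(4, 16, 2)
    scale_jitter = lambda batch: batch * rng.uniform(0.8, 1.2)
    return [gradaug_step(params, x, y, scale_jitter) for _ in range(steps)]
```

Calling `train_toy()` returns the full-network loss per step. Each step runs one full forward/backward pass plus one pass per sampled sub-network, which is consistent with the higher training cost raised in the review.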



Review for NeurIPS paper: GradAug: A New Regularization Method for Deep Neural Networks

Neural Information Processing Systems

Summary and Contributions: After the rebuttal and discussion with the other reviewers, I have updated my score. However, I still have several concerns that the authors could validate further. It is good that the authors performed the time/memory comparison in the rebuttal, as that was a significant concern of mine. My remaining concerns mostly revolve around which other techniques this method should be compared against. Given that this algorithm takes 3-4x the time of the baseline, I could, for example: 1: Train a much larger network and then use compression techniques to slim it to the same size. Mixup which is still 70% faster.




[R] NeurIPS-2020 paper: GradAug: A New Regularization Method for Deep Neural Networks

#artificialintelligence
