regmixup
5ddcfaad1cb72ce6f1a365e8f1ecf791-Supplemental-Conference.pdf
Additionally, we provide the calibration performance of various competitive approaches. Briefly, calibration quantifies how well a model's confidence matches its accuracy [Osborne, 1991]. To measure it, we employ the recently proposed Adaptive ECE (AdaECE) [Mukhoti et al., 2020]. For all the methods, the AdaECE is computed after performing temperature scaling [Guo et al., 2017]. Unfortunately, we could not make their code work on C100, as the training procedure seemed to be unstable.
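To make the metric concrete, here is a minimal NumPy sketch of Adaptive ECE as described in Mukhoti et al. [2020]: unlike standard ECE's equal-width confidence bins, AdaECE uses equal-mass bins, each holding roughly the same number of samples. The function name and binning-by-sorting approach are illustrative, not the authors' code; per the passage above, it would be applied to confidences obtained after temperature scaling.

```python
import numpy as np

def adaptive_ece(confidences, accuracies, n_bins=15):
    """Adaptive ECE sketch: equal-mass bins (each with ~n/n_bins samples),
    rather than the equal-width bins of standard ECE.

    confidences: per-sample max softmax probability (ideally post
                 temperature scaling), shape (n,)
    accuracies:  per-sample 0/1 correctness indicator, shape (n,)
    """
    confidences = np.asarray(confidences, dtype=float)
    accuracies = np.asarray(accuracies, dtype=float)
    # Sort by confidence so contiguous chunks form equal-mass bins.
    order = np.argsort(confidences)
    conf_sorted = confidences[order]
    acc_sorted = accuracies[order]
    n = len(conf_sorted)
    ece = 0.0
    for idx in np.array_split(np.arange(n), n_bins):
        if len(idx) == 0:
            continue
        # |mean accuracy - mean confidence| in the bin, weighted by bin mass.
        gap = abs(acc_sorted[idx].mean() - conf_sorted[idx].mean())
        ece += (len(idx) / n) * gap
    return ece
```

A perfectly calibrated model (confidence equal to accuracy in every bin) yields an AdaECE of zero; a model that is confident but always wrong yields a value near its mean confidence.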
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
Infinite Class Mixup
Mensink, Thomas, Mettes, Pascal
Mixup is a widely adopted strategy for training deep networks, where additional samples are augmented by interpolating inputs and labels of training pairs. Mixup has been shown to improve classification performance, network calibration, and out-of-distribution generalisation. While effective, a cornerstone of Mixup, namely that networks learn linear behaviour patterns between classes, is only indirectly enforced, since the output interpolation is performed at the probability level. This paper seeks to address this limitation by mixing the classifiers directly instead of mixing the labels for each mixed pair. We propose to define the target of each augmented sample as a uniquely new classifier, whose parameters are a linear interpolation of the classifier vectors of the input pair. The space of all possible classifiers is continuous and spans all interpolations between classifier pairs. To make optimisation tractable, we propose a dual-contrastive Infinite Class Mixup loss, where we contrast the classifier of a mixed pair to both the classifiers and the predicted outputs of other mixed pairs in a batch. Infinite Class Mixup is generic in nature and applies to many variants of Mixup. Empirically, we show that it outperforms standard Mixup and variants such as RegMixup and Remix on balanced, long-tailed, and data-constrained benchmarks, highlighting its broad applicability.
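The core idea in the abstract can be sketched as follows: each mixed sample's target is a newly interpolated classifier vector, and the dual-contrastive loss contrasts mixed samples against mixed classifiers along both axes of a batch similarity matrix. This is a hedged NumPy illustration, not the authors' implementation; the pairing of sample i with sample i+1 via a batch roll and the equal weighting of the two contrastive directions are assumptions made for brevity.

```python
import numpy as np

def log_softmax(z, axis):
    """Numerically stable log-softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=axis, keepdims=True))

def infinite_class_mixup_loss(feats, labels, W, lam):
    """Dual-contrastive Infinite Class Mixup sketch.

    feats:  feature vectors, shape (B, D)
    labels: integer class labels, shape (B,)
    W:      classifier weight matrix, shape (C, D)
    lam:    mixing coefficient in [0, 1]
    """
    # Pair sample i with sample i+1 in the batch (illustrative pairing).
    feats2 = np.roll(feats, -1, axis=0)
    labels2 = np.roll(labels, -1)
    # Mixed inputs and, crucially, mixed *classifiers* as targets:
    # each row of w_mix is a brand-new classifier vector.
    f_mix = lam * feats + (1 - lam) * feats2          # (B, D)
    w_mix = lam * W[labels] + (1 - lam) * W[labels2]  # (B, D)
    logits = f_mix @ w_mix.T                          # (B, B) similarities
    # Contrast each mixed sample against all mixed classifiers (rows)
    # and each mixed classifier against all mixed samples (columns);
    # the matching pair sits on the diagonal.
    B = len(feats)
    diag = np.arange(B)
    loss_rows = -log_softmax(logits, axis=1)[diag, diag].mean()
    loss_cols = -log_softmax(logits, axis=0)[diag, diag].mean()
    return 0.5 * (loss_rows + loss_cols)
```

Note how the label-mixing of standard Mixup never appears: the interpolation happens in classifier-parameter space, so the target space is continuous rather than a simplex over a fixed set of classes.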
RegMixup: Mixup as a Regularizer Can Surprisingly Improve Accuracy and Out-of-Distribution Robustness
Pinto, Francesco, Yang, Harry, Lim, Ser-Nam, Torr, Philip H. S., Dokania, Puneet K.
We show that the effectiveness of the well-celebrated Mixup [Zhang et al., 2018] can be further improved if, instead of using it as the sole learning objective, it is utilized as an additional regularizer to the standard cross-entropy loss. This simple change not only provides much improved accuracy but also significantly improves the quality of the predictive uncertainty estimation of Mixup in most cases under various forms of covariate shift and in out-of-distribution detection experiments. In fact, we observe that Mixup yields much degraded performance on detecting out-of-distribution samples, possibly (as we show empirically) because of its tendency to learn models that exhibit high entropy throughout, making it difficult to differentiate in-distribution samples from out-of-distribution ones. To show the efficacy of our approach (RegMixup), we provide thorough analyses and experiments on vision datasets (ImageNet & CIFAR-10/100) and compare it with a suite of recent approaches for reliable uncertainty estimation.
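The change the abstract describes is small enough to sketch directly: keep the ordinary cross-entropy term on the clean batch, and add the Mixup cross-entropy as a second, regularizing term. The NumPy sketch below is illustrative only; the weight `eta` and the use of a Beta(alpha, alpha) draw for the interpolation factor follow the usual Mixup formulation, but the exact hyperparameter names and values are assumptions, not the paper's code.

```python
import numpy as np

def cross_entropy(logits, targets):
    """Mean cross-entropy with soft targets; logits (B, C), targets (B, C)."""
    logp = logits - logits.max(axis=1, keepdims=True)
    logp = logp - np.log(np.exp(logp).sum(axis=1, keepdims=True))
    return -(targets * logp).sum(axis=1).mean()

def regmixup_loss(model, x, y_onehot, eta=1.0, alpha=20.0, rng=None):
    """RegMixup sketch: clean CE *plus* a Mixup CE used as a regularizer.

    model:    callable mapping inputs (B, D) to logits (B, C)
    x:        input batch, shape (B, D)
    y_onehot: one-hot labels, shape (B, C)
    eta:      weight of the Mixup regularizer (illustrative name)
    alpha:    Beta distribution parameter for the mixing factor
    """
    if rng is None:
        rng = np.random.default_rng()
    # Draw the mixing factor and a random pairing within the batch.
    lam = rng.beta(alpha, alpha)
    perm = rng.permutation(len(x))
    x_mix = lam * x + (1 - lam) * x[perm]
    y_mix = lam * y_onehot + (1 - lam) * y_onehot[perm]
    clean = cross_entropy(model(x), y_onehot)   # standard objective
    mixed = cross_entropy(model(x_mix), y_mix)  # Mixup as regularizer
    return clean + eta * mixed
```

Because the clean cross-entropy term remains, the model is never trained *only* on interpolated, high-entropy targets, which is consistent with the abstract's explanation of why plain Mixup hurts out-of-distribution detection.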
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
- Information Technology > Artificial Intelligence > Vision (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)