Goto

Collaborating Authors

 Tack, Jihoon


Consistency Regularization for Adversarial Robustness

arXiv.org Artificial Intelligence

Adversarial training (AT) is currently one of the most successful methods to obtain the adversarial robustness of deep neural networks. However, a significant generalization gap in the robustness obtained from AT has been problematic, making practitioners to consider a bag of tricks for a successful training, e.g., early stopping. In this paper, we investigate data augmentation (DA) techniques to address the issue. In contrast to the previous reports in the literature that DA is not effective for regularizing AT, we discover that DA can mitigate overfitting in AT surprisingly well, but they should be chosen deliberately. To utilize the effect of DA further, we propose a simple yet effective auxiliary 'consistency' regularization loss to optimize, which forces predictive distributions after attacking from two different augmentations to be similar to each other. Our experimental results demonstrate that our simple regularization scheme is applicable for a wide range of AT methods, showing consistent yet significant improvements in the test robust accuracy. More remarkably, we also show that our method could significantly help the model to generalize its robustness against unseen adversaries, e.g., other types or larger perturbations compared to those used during training. Code is available at https://github.com/alinlab/consistency-adversarial.


Adversarial Self-Supervised Contrastive Learning

arXiv.org Machine Learning

Existing adversarial learning approaches mostly use class labels to generate adversarial samples that lead to incorrect predictions, which are then used to augment the training of the model for improved robustness. While some recent works propose semi-supervised adversarial learning methods that utilize unlabeled data, they still require class labels. However, do we really need class labels at all, for adversarially robust training of deep neural networks? In this paper, we propose a novel adversarial attack for unlabeled data, which makes the model confuse the instance-level identities of the perturbed data samples. Further, we present a self-supervised contrastive learning framework to adversarially train a robust neural network without labeled data, which aims to maximize the similarity between a random augmentation of a data sample and its instance-wise adversarial perturbation. We validate our method, Robust Contrastive Learning (RoCL), on multiple benchmark datasets, on which it obtains comparable robust accuracy over state-of-the-art supervised adversarial learning methods, and significantly improved robustness against the black box and unseen types of attacks. Moreover, with further joint fine-tuning with supervised adversarial loss, RoCL obtains even higher robust accuracy over using self-supervised learning alone. Notably, RoCL also demonstrate impressive results in robust transfer learning.


CSI: Novelty Detection via Contrastive Learning on Distributionally Shifted Instances

arXiv.org Machine Learning

Novelty detection, i.e., identifying whether a given sample is drawn from outside the training distribution, is essential for reliable machine learning. To this end, there have been many attempts at learning a representation well-suited for novelty detection and designing a score based on such representation. In this paper, we propose a simple, yet effective method named contrasting shifted instances (CSI), inspired by the recent success on contrastive learning of visual representations. Specifically, in addition to contrasting a given sample with other instances as in conventional contrastive learning methods, our training scheme contrasts the sample with distributionally-shifted augmentations of itself. Based on this, we propose a new detection score that is specific to the proposed training scheme. Our experiments demonstrate the superiority of our method under various novelty detection scenarios, including unlabeled one-class, unlabeled multi-class and labeled multi-class settings, with various image benchmark datasets. Code and pre-trained models are available at https://github.com/alinlab/CSI.