MALT Powers Up Adversarial Attacks
Current adversarial attacks for multi-class classifiers choose potential adversarial target classes naively, based on the classifier's confidence levels. We present a novel adversarial targeting method, \textit{MALT - Mesoscopic Almost Linearity Targeting}, based on local almost-linearity assumptions. Our attack outperforms the current state-of-the-art AutoAttack on the standard benchmark datasets CIFAR-100 and ImageNet, across different robust models. In particular, our attack strategy is \emph{five times faster} than AutoAttack's while matching all of AutoAttack's successes and additionally attacking samples that were previously out of reach. We further prove formally, and demonstrate empirically, that our targeting method, although inspired by linear predictors, also applies to non-linear models.
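The targeting idea described above can be illustrated with a minimal sketch (our own construction under stated assumptions, not code from the paper): instead of ranking candidate target classes by confidence alone, rank them by an estimate of the distance to each decision boundary under local linearity, namely the logit gap divided by the norm of the gradient difference. The function name `malt_target_order` and the toy linear classifier are illustrative assumptions.

```python
import numpy as np

def malt_target_order(logits, grads, pred_class):
    # Under local (almost) linearity, the distance from x to the decision
    # boundary between the predicted class and a candidate class c is roughly
    # (z_pred - z_c) / ||g_pred - g_c||; a smaller value suggests an easier
    # adversarial target, so we sort candidates by it in ascending order.
    scores = {}
    for c in range(len(logits)):
        if c == pred_class:
            continue
        gap = logits[pred_class] - logits[c]
        denom = np.linalg.norm(grads[pred_class] - grads[c]) + 1e-12
        scores[c] = gap / denom
    return sorted(scores, key=scores.get)

# Toy linear classifier: logits = W @ x, so the gradient of logit i
# with respect to x is exactly the weight row W[i].
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))
x = rng.normal(size=8)
logits = W @ x
order = malt_target_order(logits, W, int(np.argmax(logits)))
```

For a non-linear model the gradient rows would be replaced by per-logit input gradients; the ranking heuristic itself is unchanged.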
Supplement
In this section, we give an overview of related work on stable neural ODE networks. We also give an overview of common adversarial attacks and recent works that defend against adversarial examples.

Stable Neural Networks. Gradient vanishing and gradient exploding are two well-known phenomena in deep learning [1]. The gradient of the objective function, which strongly depends on the training method as well as the neural network architecture, indicates how sensitive the output is with respect to (w.r.t.) input perturbations. An exploding gradient implies instability of the output w.r.t. the input and thus results in a non-robust learning architecture.
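The exploding/vanishing effect can be made concrete with a small numerical sketch (our own illustration, not from the supplement): in a deep linear network the input-output Jacobian is the product of the weight matrices, so its norm grows or shrinks geometrically with depth depending on whether the layers' spectral norms sit above or below 1.

```python
import numpy as np

def jacobian_norm(scale, depth, dim=16, seed=0):
    # Depth-L linear network y = W_L ... W_1 x: the Jacobian dy/dx is the
    # product of the weight matrices. Each W has i.i.d. Gaussian entries
    # scaled so its typical amplification factor is roughly `scale`.
    rng = np.random.default_rng(seed)
    J = np.eye(dim)
    for _ in range(depth):
        W = scale * rng.normal(size=(dim, dim)) / np.sqrt(dim)
        J = W @ J
    return np.linalg.norm(J, 2)  # spectral norm of the full Jacobian

exploding = jacobian_norm(1.5, 30)  # amplification > 1: norm blows up with depth
vanishing = jacobian_norm(0.5, 30)  # amplification < 1: norm collapses to ~0
```

A large Jacobian norm means a tiny input perturbation can swing the output arbitrarily, which is exactly the sensitivity that adversarial examples exploit; stable architectures aim to keep this quantity controlled.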
Contributions and Relation to Prior Work
We thank the reviewers for their helpful comments. Below, we address some of the points made regarding our work's contributions and relation to prior work.

On automated attacks (Reviewers 1 and 3). Reviewers 1 and 3 argue that "AutoAttack" (Croce & Hein) [...] the "k-winners take all" defense (19% accuracy), whereas we reduce it to 0% accuracy. [...] "Adversarial Training" and "Are Generative Classifiers More Robust"). Of the 13 defenses we study, 5 aim at detecting adversarial examples. AutoAttack also cannot be directly applied to "Temporal Dependency" (a speech-to-text model) or to "Robust Sparse Fourier Transform" (which is aimed at perturbations of small norm). We believe AutoAttack is a strong, non-adaptive baseline; the above points illustrate why. We apologize for not clarifying this in the paper. We still view these as white-box attacks.

On related work & technical novelty (Reviewer 3). We view the fact that defenses are broken by existing techniques [...]. This is what differentiates our work from prior work that proposed and argued for adaptive attacks (e.g., Carlini & [...]).