MALT Powers Up Adversarial Attacks
Current adversarial attacks for multi-class classifiers choose potential adversarial target classes naively, based on the classifier's confidence levels. We present a novel adversarial targeting method, \textit{MALT - Mesoscopic Almost Linearity Targeting}, based on local almost-linearity assumptions. Our attack outperforms the current state-of-the-art AutoAttack on the standard benchmark datasets CIFAR-100 and ImageNet and across different robust models. In particular, our attack uses an attack strategy that is \emph{five times faster} than AutoAttack's, while matching AutoAttack's successes and additionally attacking samples that were previously out of reach. We further prove formally, and demonstrate empirically, that our targeting method, although inspired by linear predictors, also applies to non-linear models.
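The contrast between the two targeting strategies in the abstract can be sketched in a few lines. The snippet below is an illustrative assumption, not the paper's exact formula: naive targeting ranks candidate classes by logit (confidence), while the linearity-inspired ranking divides the logit gap by the norm of the gradient difference, a DeepFool-style estimate of the distance to each decision boundary under a locally linear model.

```python
import numpy as np

def confidence_targets(logits, true_class):
    """Naive targeting: rank candidate target classes by logit (confidence)."""
    order = np.argsort(logits)[::-1]
    return [int(c) for c in order if c != true_class]

def linearity_targets(logits, grads, true_class):
    """Linearity-inspired targeting (sketch): rank classes by the logit gap
    normalized by the gradient-difference norm, i.e. an estimate of the
    distance to each decision boundary under a locally linear model.
    `grads[c]` holds the gradient of logit c w.r.t. the input."""
    scores = {}
    for c in range(len(logits)):
        if c == true_class:
            continue
        gap = logits[true_class] - logits[c]
        denom = np.linalg.norm(grads[true_class] - grads[c]) + 1e-12
        scores[c] = gap / denom  # smaller score => boundary estimated closer
    return sorted(scores, key=scores.get)
```

A class with a slightly smaller logit but a near-parallel gradient can be far from the decision boundary, so the two rankings can disagree on which target to try first.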
Contributions and Relation to Prior Work
We thank the reviewers for their helpful comments. Below, we address some of the points raised regarding our work's contributions and relation to prior work.

On automated attacks (Reviewers 1 and 3). Reviewer 1 and Reviewer 3 argue that "AutoAttack" (Croce & Hein) should serve as a comparison. On the "k-winners take all" defense, AutoAttack leaves 19% accuracy, whereas we reduce it to 0% accuracy. Of the 13 defenses we study, 5 aim at detecting adversarial examples, and AutoAttack is not applicable to defenses such as "Adversarial Training" and "Are Generative Classifiers More Robust". AutoAttack also cannot be directly applied to "Temporal Dependency" (a speech-to-text model) or to "Robust Sparse Fourier Transform" (which is aimed at perturbations of small norm). We believe AutoAttack is a strong, non-adaptive baseline, and the above points illustrate why. We apologize for not clarifying this in the paper. We still view these as white-box attacks.

On related work & technical novelty (Reviewer 3). We view the fact that defenses are broken by existing techniques as what differentiates our work from prior work that proposed and argued for adaptive attacks (e.g., Carlini &
Supplement
In this section, we give an overview of related work on stable neural ODE networks. We also review common adversarial attacks and recent works that defend against adversarial examples.

Stable Neural Networks. Gradient vanishing and gradient exploding are two well-known phenomena in deep learning [1]. The gradient of the objective function, which strongly depends on both the training method and the neural network architecture, indicates how sensitive the output is with respect to (w.r.t.) input perturbations. An exploding gradient implies instability of the output w.r.t. the input, and thus results in a non-robust learning architecture.
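The link between input-gradient magnitude and robustness can be checked numerically. The toy network and finite-difference probe below are illustrative assumptions (not from the works surveyed here): scaling up the weights of a small ReLU network scales up the input-gradient norm, i.e. the output becomes proportionally more sensitive to input perturbations.

```python
import numpy as np

def mlp(x, W1, W2):
    """Toy two-layer network: ReLU hidden layer, scalar linear output."""
    return (W2 @ np.maximum(W1 @ x, 0.0))[0]

def input_grad_norm(f, x, eps=1e-5):
    """Central finite-difference estimate of ||df/dx||, a proxy for how
    sensitive the scalar output f(x) is to input perturbations."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (f(x + e) - f(x - e)) / (2.0 * eps)
    return float(np.linalg.norm(g))
```

With ReLU activations the active-unit pattern is unchanged by a positive rescaling of the first-layer weights, so the input-gradient norm grows linearly with the scale; stability-oriented architectures (e.g. spectral-norm constraints) aim to keep this quantity bounded.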
Adversarial Attacks on Reinforcement Learning-based Medical Questionnaire Systems: Input-level Perturbation Strategies and Medical Constraint Validation
RL-based medical questionnaire systems have shown great potential in medical scenarios, but their safety and robustness remain unresolved. This study performs a comprehensive evaluation of adversarial attack methods to identify and analyze their potential vulnerabilities. We formulate the diagnosis process as a Markov Decision Process (MDP), where the state consists of the patient's responses and the as-yet-unasked questions, and the action is either to ask a question or to make a diagnosis. We implemented six prevalent attack methods: the Fast Gradient Sign Method (FGSM), Projected Gradient Descent (PGD), the Carlini & Wagner (C&W) attack, the Basic Iterative Method (BIM), DeepFool, and AutoAttack, each with seven epsilon values. To ensure the generated adversarial examples remain clinically plausible, we developed a comprehensive medical validation framework consisting of 247 medical constraints, including physiological bounds, symptom correlations, and conditional medical constraints, and achieved a 97.6% success rate in generating clinically plausible adversarial samples. We performed our experiments on the National Health Interview Survey (NHIS) dataset (https://www.cdc.gov/nchs/nhis/), which consists of 182,630 samples, predicting participants' 4-year mortality. We evaluated our attacks on the AdaptiveFS framework proposed in arXiv:2004.00994. Our results show that adversarial attacks can significantly degrade diagnostic accuracy, with attack success rates ranging from 33.08% (FGSM) to 64.70% (AutoAttack). Our work demonstrates that, even under strict medical constraints on the input, such RL-based medical questionnaire systems still exhibit significant vulnerabilities.
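The combination of a gradient-sign perturbation with physiological bounds can be sketched as a single FGSM step followed by a projection. This is a minimal illustration, not the paper's 247-constraint framework: the per-feature bounds below (heart rate in bpm, body temperature in °C) are assumed values chosen for the example.

```python
import numpy as np

def fgsm_with_bounds(x, grad, eps, lower, upper):
    """One FGSM step: move x by eps in the gradient-sign direction, then
    project each feature back into its (assumed) physiological bounds."""
    x_adv = x + eps * np.sign(grad)
    return np.clip(x_adv, lower, upper)
```

For example, perturbing a record `[heart_rate=80.0, temp=36.6]` with `eps=5` against gradient `[1.0, -2.0]` pushes the temperature below a plausible range, and the projection step clips it back to the assumed lower bound of 35 °C while the heart-rate change survives intact. Iterating this step with a smaller `eps` and the same projection yields a BIM/PGD-style constrained attack.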