AITopics | ensemble defense

Adversarial attacks can deceive neural networks by adding tiny perturbations to their input data. Ensemble defenses, which are trained to minimize attack transferability among sub-models, offer a promising research direction to improve robustness against such attacks while maintaining high accuracy on natural inputs. We discover, however, that recent state-of-the-art (SOTA) adversarial attack strategies cannot reliably evaluate ensemble defenses and substantially overestimate their robustness. This paper identifies the two factors that contribute to this behavior. First, these defenses form ensembles that are notably difficult for existing gradient-based methods to attack, due to gradient obfuscation. Second, ensemble defenses diversify sub-model gradients, which makes it challenging to defeat all sub-models simultaneously: simply summing their contributions may counteract the overall attack objective. Yet we observe that an ensemble may still be fooled even when most sub-models are correct. We therefore introduce MORA, a model-reweighing attack that steers adversarial example synthesis by reweighing the importance of sub-model gradients. MORA finds that recent ensemble defenses all exhibit varying degrees of overestimated robustness. Compared with recent SOTA white-box attacks, it can converge orders of magnitude faster while achieving higher attack success rates across all ensemble models examined with three different ensemble modes (i.e., ensembling by either softmax, voting or logits).
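The second factor in the abstract (diversified sub-model gradients cancelling under plain summation) can be illustrated with a toy sketch. The gradient values and the weighting rule below are made up for illustration; the abstract does not state how MORA computes its weights, so this is not the authors' method.

```python
# Toy illustration only: gradients and weights are hypothetical, not MORA's rule.

def weighted_sum(grads, weights):
    """Combine per-sub-model gradients using one scalar weight per sub-model."""
    dim = len(grads[0])
    return [sum(w * g[i] for w, g in zip(weights, grads)) for i in range(dim)]

def norm(v):
    """Euclidean length of a vector."""
    return sum(x * x for x in v) ** 0.5

# Made-up loss gradients of three sub-models with respect to a 2-D input.
# Sub-models 0 and 1 pull in nearly opposite directions.
grads = [[1.0, 0.0], [-1.0, 0.1], [0.0, 0.2]]

# Plain summation: the first components cancel, leaving a weak attack direction.
plain = weighted_sum(grads, [1.0, 1.0, 1.0])

# Hypothetical reweighing: drop sub-model 1 and focus on sub-models 0 and 2.
reweighed = weighted_sum(grads, [1.0, 0.0, 1.0])

print("plain sum:", plain, "length:", norm(plain))
print("reweighed:", reweighed, "length:", norm(reweighed))
```

In this toy case the reweighed direction is much longer than the plain sum, which is the intuition behind steering the attack with per-sub-model weights instead of treating all gradients equally.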

ensemble robustness evaluation, model reweighing attack, name change, (5 more...)

Neural Information Processing Systems

Industry:

Information Technology > Security & Privacy (0.84)
Government > Military (0.84)

Technology:

Information Technology > Security & Privacy (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.59)

MORA: Improving Ensemble Robustness Evaluation with Model-Reweighing Attack

Yunrui Yu

Neural Information Processing Systems Aug-17-2025, 15:02:55 GMT

Adversarial attacks can deceive neural networks by adding tiny perturbations to their input data.

artificial intelligence, machine learning, robustness, (18 more...)

Neural Information Processing Systems

Country:

Asia > Macao (0.14)
Asia > China > Guangdong Province > Shenzhen (0.04)
North America > Canada > Ontario > Toronto (0.04)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

MORA: Improving Ensemble Robustness Evaluation with Model-Reweighing Attack

Yunrui Yu

Neural Information Processing Systems Aug-17-2025, 15:02:51 GMT

Adversarial attacks can deceive neural networks by adding tiny perturbations to their input data.

artificial intelligence, machine learning, robustness, (18 more...)

Neural Information Processing Systems

Country:

Asia > Macao (0.14)
Asia > China > Guangdong Province > Shenzhen (0.05)
North America > Canada > Ontario > Toronto (0.04)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

MORA: Improving Ensemble Robustness Evaluation with Model Reweighing Attack

Neural Information Processing Systems Jan-18-2025, 12:29:33 GMT

Adversarial attacks can deceive neural networks by adding tiny perturbations to their input data. Ensemble defenses, which are trained to minimize attack transferability among sub-models, offer a promising research direction to improve robustness against such attacks while maintaining high accuracy on natural inputs. We discover, however, that recent state-of-the-art (SOTA) adversarial attack strategies cannot reliably evaluate ensemble defenses and substantially overestimate their robustness. This paper identifies the two factors that contribute to this behavior. First, these defenses form ensembles that are notably difficult for existing gradient-based methods to attack, due to gradient obfuscation.

ensemble defense, ensemble robustness evaluation, model reweighing attack, (3 more...)

Neural Information Processing Systems

Industry:

Information Technology > Security & Privacy (0.87)
Government > Military (0.87)

Technology:

Information Technology > Security & Privacy (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.40)

Game Theoretic Mixed Experts for Combinational Adversarial Machine Learning

Rathbun, Ethan, Mahmood, Kaleel, Ahmad, Sohaib, Ding, Caiwen, van Dijk, Marten

arXiv.org Artificial Intelligence Apr-29-2023

Recent advances in adversarial machine learning have shown that defenses considered to be robust are actually susceptible to adversarial attacks that are specifically customized to target their weaknesses. These defenses include Barrage of Random Transforms (BaRT), Friendly Adversarial Training (FAT), Trash is Treasure (TiT) and ensemble models made up of Vision Transformers (ViTs), Big Transfer models and Spiking Neural Networks (SNNs). We first conduct a transferability analysis to demonstrate that the adversarial examples generated by customized attacks on one defense are not often misclassified by another defense. This finding leads to two important questions. First, how can the low transferability between defenses be utilized in a game-theoretic framework to improve robustness? Second, how can an adversary within this framework develop effective multi-model attacks? In this paper, we provide a game-theoretic framework for ensemble adversarial attacks and defenses. Our framework is called Game theoretic Mixed Experts (GaME). It is designed to find the Mixed-Nash strategy for both a detector-based and a standard defender when facing an attacker employing compositional adversarial attacks. We further propose three new attack algorithms, specifically designed to target defenses with randomized transformations, multi-model voting schemes, and adversarial detector architectures. These attacks serve both to strengthen defenses generated by the GaME framework and to verify their robustness against unforeseen attacks. Overall, our framework and analyses advance the field of adversarial machine learning by yielding new insights into compositional attack and defense formulations.
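To make the idea of a mixed Nash strategy concrete, here is a minimal sketch for the smallest possible case: a defender choosing between two defenses and an attacker choosing between two attacks. The payoff numbers (robust accuracy) are invented, and the closed-form 2x2 solution is a textbook result for zero-sum games; the abstract does not describe GaME's actual game or solver.

```python
# Hypothetical 2x2 zero-sum game; not the GaME formulation itself.

def solve_2x2_zero_sum(m):
    """Return (p, q, value) for a 2x2 zero-sum game without a saddle point.

    m[i][j] is the row player's payoff when row i meets column j.
    p is the probability the row player picks row 0,
    q is the probability the column player picks column 0.
    """
    (a, b), (c, d) = m
    maximin = max(min(a, b), min(c, d))  # best guaranteed pure payoff for rows
    minimax = min(max(a, c), max(b, d))  # best guaranteed pure cap for columns
    if maximin == minimax:
        raise ValueError("pure-strategy saddle point; mixing is not needed")
    denom = a - b - c + d
    p = (d - c) / denom
    q = (d - b) / denom
    value = (a * d - b * c) / denom
    return p, q, value

# Made-up robust accuracy: rows are defenses, columns are attacks.
payoff = [[0.9, 0.2],
          [0.3, 0.8]]

p, q, value = solve_2x2_zero_sum(payoff)
best_pure = max(min(row) for row in payoff)

print("defender plays defense 0 with probability", p)
print("attacker plays attack 0 with probability", q)
print("guaranteed robust accuracy (mixed):", value, "vs best pure:", best_pure)
```

With these invented numbers, randomizing between the two defenses guarantees a higher worst-case accuracy than committing to either one, which is the kind of benefit a low-transferability ensemble can exploit.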

adversarial example, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2211.14669

Country:

North America > United States > Connecticut (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report > New Finding (0.93)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (0.75)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

MORA: Improving Ensemble Robustness Evaluation with Model-Reweighing Attack

Yu, Yunrui, Gao, Xitong, Xu, Cheng-Zhong

arXiv.org Artificial Intelligence Nov-15-2022

Adversarial attacks can deceive neural networks by adding tiny perturbations to their input data. Ensemble defenses, which are trained to minimize attack transferability among sub-models, offer a promising research direction to improve robustness against such attacks while maintaining high accuracy on natural inputs. We discover, however, that recent state-of-the-art (SOTA) adversarial attack strategies cannot reliably evaluate ensemble defenses and substantially overestimate their robustness. This paper identifies the two factors that contribute to this behavior. First, these defenses form ensembles that are notably difficult for existing gradient-based methods to attack, due to gradient obfuscation. Second, ensemble defenses diversify sub-model gradients, which makes it challenging to defeat all sub-models simultaneously: simply summing their contributions may counteract the overall attack objective. Yet we observe that an ensemble may still be fooled even when most sub-models are correct. We therefore introduce MORA, a model-reweighing attack that steers adversarial example synthesis by reweighing the importance of sub-model gradients. MORA finds that recent ensemble defenses all exhibit varying degrees of overestimated robustness. Compared with recent SOTA white-box attacks, it can converge orders of magnitude faster while achieving higher attack success rates across all ensemble models examined with three different ensemble modes (i.e., ensembling by either softmax, voting or logits). In particular, most ensemble defenses exhibit near or exactly 0% robustness against MORA with $\ell^\infty$ perturbation within 0.02 on CIFAR-10, and 0.01 on CIFAR-100. We make MORA open source with reproducible results and pre-trained models, and provide a leaderboard of ensemble defenses under various attack strategies.
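The three ensemble modes named in the abstract (softmax, voting, logits) can be sketched as follows. This is a generic illustration with made-up logits, assuming each mode aggregates sub-model outputs in the usual way; it is not taken from the MORA code base. The example also shows how an ensemble can be fooled while most sub-models are still correct.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one sub-model's logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(values):
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])

def ensemble_predict(all_logits, mode):
    """Predict a class from per-sub-model logits under one ensemble mode."""
    n_classes = len(all_logits[0])
    if mode == "logits":
        # Sum raw logits (same argmax as averaging them).
        scores = [sum(l[c] for l in all_logits) for c in range(n_classes)]
    elif mode == "softmax":
        # Sum per-sub-model probabilities (same argmax as averaging them).
        probs = [softmax(l) for l in all_logits]
        scores = [sum(p[c] for p in probs) for c in range(n_classes)]
    elif mode == "voting":
        # Each sub-model casts one vote for its own top class.
        scores = [0] * n_classes
        for l in all_logits:
            scores[argmax(l)] += 1
    else:
        raise ValueError("unknown ensemble mode: " + mode)
    return argmax(scores)

# Made-up logits, true class 0: two sub-models are weakly correct,
# one sub-model is confidently wrong.
logits = [[0.2, 0.0], [0.2, 0.0], [0.0, 5.0]]

for mode in ("softmax", "voting", "logits"):
    print(mode, "->", ensemble_predict(logits, mode))
```

Here two of three sub-models still predict class 0, yet the softmax and logits ensembles output class 1 because the single confident sub-model dominates the aggregate; only voting stays correct.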

artificial intelligence, machine learning, robustness, (17 more...)

arXiv.org Artificial Intelligence

2211.08008

Country:

Asia > Macao (0.14)
North America > Canada > Ontario > Toronto (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.48)

Denoising and Verification Cross-Layer Ensemble Against Black-box Adversarial Attacks

Chow, Ka-Ho, Wei, Wenqi, Wu, Yanzhao, Liu, Ling

arXiv.org Machine Learning Aug-20-2019

Deep neural networks (DNNs) have demonstrated impressive performance on many challenging machine learning tasks. However, DNNs are vulnerable to adversarial inputs generated by adding maliciously crafted perturbations to benign inputs. As a growing number of attacks have been reported to generate adversarial inputs of varying sophistication, the defense-attack arms race has accelerated. MODEF intelligently combines an unsupervised model denoising ensemble with a supervised model verification ensemble by quantifying model diversity, aiming to boost the robustness of the target model against adversarial examples. Evaluated using eleven representative attacks on popular benchmark datasets, we show that MODEF achieves remarkable defense success rates compared with existing defense methods, and provides a superior capability of repairing adversarial inputs and making correct predictions with high accuracy in the presence of black-box attacks. The recent advances in deep neural networks (DNNs) have powered numerous applications in different domains due to their outstanding performance compared to traditional machine learning techniques. However, it has been shown that DNNs can be easily fooled by adversarial inputs [1], making them a double-edged sword, as the vulnerability of DNNs to adversarial attacks poses serious threats to many security-critical applications, such as biometric authentication and autonomous driving. As a number of defenses are being proposed, more attacks of varying sophistication have been put forward, accelerating the defense-attack arms race. Some even argue that designing new attacks requires much less effort than developing effective defenses. Thus, improving robustness and defensibility against adversarial attacks is crucial.
Adversarial examples are generated by maliciously perturbing benign examples sent to the target DNN model through querying its prediction API, aiming to fool the target model into producing incorrect predictions, either randomly (untargeted attack) or purposefully (targeted attack).

artificial intelligence, ensemble, machine learning, (19 more...)

arXiv.org Machine Learning

1908.07667

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Industry:

Information Technology > Security & Privacy (1.00)
Government (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Clipping free attacks against artificial neural networks

Addad, Boussad, Kodjabashian, Jerome, Meyer, Christophe

arXiv.org Machine Learning Mar-26-2018

In recent years, a remarkable breakthrough has been made in the AI domain thanks to deep artificial neural networks, which have achieved great success in many machine learning tasks in computer vision, natural language processing, speech recognition, malware detection and so on. However, they are highly vulnerable to easily crafted adversarial examples. Many investigations have pointed out this fact, and different approaches have been proposed to generate attacks while adding a limited perturbation to the original data. The most robust known method so far is the so-called C&W attack [1]. Nonetheless, a countermeasure known as feature squeezing coupled with ensemble defense showed that most of these attacks can be neutralized [6]. In this paper, we present a new method we call Centered Initial Attack (CIA) whose advantage is twofold: first, it ensures by construction that the maximum perturbation is smaller than a threshold fixed beforehand, without the clipping process that degrades the quality of attacks. Second, it is robust against recently introduced defenses such as feature squeezing, JPEG encoding and even against a voting ensemble of defenses. While its application is not limited to images, we illustrate this using five of the current best classifiers on the ImageNet dataset, two of which are adversarially retrained on purpose to be robust against attacks. With a fixed maximum perturbation of only 1.5% on any pixel, around 80% of targeted attacks fool the voting ensemble defense, and nearly 100% do when the perturbation is only 6%. While this shows how difficult it is to defend against CIA attacks, the last section of the paper gives some guidelines to limit their impact.
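One common way to bound a perturbation by construction, without a clipping step, is a tanh change of variable around the center of each pixel's allowed interval. The sketch below assumes pixel values scaled to [0, 1] and a 1.5% budget expressed as `eps = 0.015`; the function name and this exact construction are assumptions for illustration and may differ from how CIA is defined in the paper.

```python
import math

def bounded_adversarial(x, w, eps):
    """Map unconstrained variables w to pixels within eps of x and inside [0, 1].

    Hypothetical helper: each pixel gets an allowed interval [lo, hi], and
    tanh squashes w into that interval, so no clipping is ever applied.
    """
    out = []
    for xi, wi in zip(x, w):
        lo = max(0.0, xi - eps)          # lower end of the allowed interval
        hi = min(1.0, xi + eps)          # upper end of the allowed interval
        center = (lo + hi) / 2.0
        half_range = (hi - lo) / 2.0
        out.append(center + half_range * math.tanh(wi))
    return out

x = [0.0, 0.5, 0.99]     # original pixel values (made up)
w = [-2.0, 0.5, 3.0]     # free variables an optimizer would update
eps = 0.015              # maximum perturbation of 1.5% per pixel

x_adv = bounded_adversarial(x, w, eps)
print(x_adv)
```

Because tanh maps every real number into (-1, 1), an optimizer can update `w` freely while each output pixel stays within the budget and within the valid range, including pixels near 0 or 1 where a symmetric interval would otherwise need clipping.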

artificial intelligence, classifier, machine learning, (16 more...)

arXiv.org Machine Learning

1803.09468

Genre: Research Report (0.83)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)
