Roli, Fabio
ImageNet-Patch: A Dataset for Benchmarking Machine Learning Robustness against Adversarial Patches
Pintor, Maura, Angioni, Daniele, Sotgiu, Angelo, Demetrio, Luca, Demontis, Ambra, Biggio, Battista, Roli, Fabio
Understanding the security of machine-learning models is of paramount importance nowadays, as these algorithms are used in a large variety of settings, including security-related and mission-critical applications, to extract actionable knowledge from vast amounts of data. Nevertheless, such data-driven algorithms are not robust against adversarial perturbations of the input data [1, 2, 3, 4]. In particular, attackers can hinder the performance of classification algorithms by means of adversarial patches. Adversarial patches are created by solving an optimization problem via gradient descent. However, this process is costly as it requires both querying the target model many times and computing the back-propagation algorithm until convergence is reached. Hence, it is not possible to obtain a fast robustness evaluation against adversarial patches without avoiding all the computational costs required by their optimization process. To further exacerbate the problem, adversarial patches should also be effective under different transformations, including translation, rotation and scale changes.
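To make the optimization process summarized above concrete, the following is a minimal PyTorch sketch of a targeted patch attack whose loss is averaged over random rotations and scalings of the patch; the classifier `model`, the fixed top-left placement, and all function and parameter names are illustrative assumptions rather than the ImageNet-Patch code.

import torch
import torch.nn.functional as F
import torchvision.transforms as T

def optimize_patch(model, images, target_class, patch_size=50,
                   steps=200, lr=0.05, device="cpu"):
    """Targeted adversarial patch optimized by gradient descent, averaging the
    loss over random rotations and scalings of the patch (EOT-style)."""
    model.eval()
    patch = torch.rand(1, 3, patch_size, patch_size, device=device,
                       requires_grad=True)
    optimizer = torch.optim.Adam([patch], lr=lr)
    random_affine = T.RandomAffine(degrees=30, scale=(0.7, 1.3))
    labels = torch.full((images.shape[0],), target_class, device=device)

    for _ in range(steps):
        optimizer.zero_grad()
        x = images.clone().to(device)
        # paste a randomly transformed copy of the patch (fixed placement here)
        p = random_affine(patch.clamp(0, 1))
        x[:, :, :patch_size, :patch_size] = p.expand(x.shape[0], -1, -1, -1)
        loss = F.cross_entropy(model(x), labels)  # push towards target_class
        loss.backward()
        optimizer.step()
        patch.data.clamp_(0, 1)  # keep the patch a valid image
    return patch.detach()

Once optimized, such a patch can simply be pasted onto unseen images, which is what enables the fast, optimization-free robustness evaluation the dataset targets.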
SoK: On the Offensive Potential of AI
Schröer, Saskia Laura, Apruzzese, Giovanni, Human, Soheil, Laskov, Pavel, Anderson, Hyrum S., Bernroider, Edward W. N., Fass, Aurore, Nassi, Ben, Rimmer, Vera, Roli, Fabio, Salam, Samer, Shen, Ashley, Sunyaev, Ali, Wadwha-Brown, Tim, Wagner, Isabel, Wang, Gang
Our society increasingly benefits from Artificial Intelligence (AI). Unfortunately, more and more evidence shows that AI is also used for offensive purposes. Prior works have revealed various examples of use cases in which the deployment of AI can lead to violations of security and privacy objectives. No extant work, however, has been able to draw a holistic picture of the offensive potential of AI. In this SoK paper we seek to lay the groundwork for a systematic analysis of the heterogeneous capabilities of offensive AI. In particular, we (i) account for AI risks to both humans and systems while (ii) consolidating and distilling knowledge from academic literature, expert opinions, industrial venues, as well as laypeople -- all of which are valuable sources of information on offensive AI. To enable alignment of such diverse sources of knowledge, we devise a common set of criteria reflecting essential technological factors related to offensive AI. With the help of these criteria, we systematically analyze: 95 research papers; 38 InfoSec briefings (from, e.g., BlackHat); the responses of a user study (N=549) involving individuals with diverse backgrounds and expertise; and the opinions of 12 experts. Our contributions not only reveal concerning ways (some of which are overlooked by prior work) in which AI can be offensively used today, but also represent a foothold for addressing this threat in the years to come.
Backdoor Learning Curves: Explaining Backdoor Poisoning Beyond Influence Functions
Cinà, Antonio Emanuele, Grosse, Kathrin, Vascon, Sebastiano, Demontis, Ambra, Biggio, Battista, Roli, Fabio, Pelillo, Marcello
Backdoor attacks inject poisoning samples during training, with the goal of forcing a machine learning model to output an attacker-chosen class when presented a specific trigger at test time. Although backdoor attacks have been demonstrated in a variety of settings and against different models, the factors affecting their effectiveness are still not well understood. In this work, we provide a unifying framework to study the process of backdoor learning under the lens of incremental learning and influence functions. We show that the effectiveness of backdoor attacks depends on: (i) the complexity of the learning algorithm, controlled by its hyperparameters; (ii) the fraction of backdoor samples injected into the training set; and (iii) the size and visibility of the backdoor trigger. These factors affect how fast a model learns to correlate the presence of the backdoor trigger with the target class. Our analysis unveils the intriguing existence of a region in the hyperparameter space in which the accuracy on clean test samples is still high while backdoor attacks are ineffective, thereby suggesting novel criteria to improve existing defenses.
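As an illustration of the attack setting studied here, the sketch below injects a backdoor trigger into a fraction of a training set; the array layout and the helper name `poison_dataset` are assumptions made for this example, not code from the paper. The three exposed knobs mirror the factors discussed above: the poisoning fraction, and the size and visibility (brightness) of the trigger.

import numpy as np

def poison_dataset(X, y, target_class, poison_fraction=0.05,
                   trigger_size=3, trigger_value=1.0, seed=0):
    """Inject a small square trigger into a fraction of training images and
    relabel them with the attacker-chosen target class.

    X: float array of shape (n_samples, height, width, channels) in [0, 1].
    """
    rng = np.random.default_rng(seed)
    X_p, y_p = X.copy(), y.copy()
    n_poison = int(poison_fraction * len(X))
    idx = rng.choice(len(X), size=n_poison, replace=False)
    # trigger: a bright square in the bottom-right corner of each image
    X_p[idx, -trigger_size:, -trigger_size:, :] = trigger_value
    y_p[idx] = target_class
    return X_p, y_p, idx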
Robust image classification with multi-modal large language models
Villani, Francesco, Maljkovic, Igor, Lazzaro, Dario, Sotgiu, Angelo, Cinà, Antonio Emanuele, Roli, Fabio
Deep Neural Networks are vulnerable to adversarial examples, i.e., carefully crafted input samples that can cause models to make incorrect predictions with high confidence. To mitigate these vulnerabilities, adversarial training and detection-based defenses have been proposed to strengthen models in advance. However, most of these approaches focus on a single data modality, overlooking the relationships between visual patterns and textual descriptions of the input. In this paper, we propose a novel defense, Multi-Shield, designed to combine and complement these defenses with multi-modal information to further enhance their robustness. Multi-Shield leverages multi-modal large language models to detect adversarial examples and abstain from uncertain classifications when there is no alignment between textual and visual representations of the input. Extensive evaluations on CIFAR-10 and ImageNet datasets, using robust and non-robust image classification models, demonstrate that Multi-Shield can be easily integrated to detect and reject adversarial examples, outperforming the original defenses.
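A rough sketch of the kind of multi-modal check described above, using a CLIP model from Hugging Face transformers to test whether the classifier's prediction is supported by image/text alignment; the decision rule, the threshold, and the function names are illustrative and may differ from the actual Multi-Shield implementation.

import torch
from transformers import CLIPModel, CLIPProcessor

clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def multishield_predict(classifier, x, pil_image, class_names, threshold=0.5):
    """Return the classifier's prediction, or None (abstain) when it is not
    supported by CLIP's zero-shot image/text alignment.
    x: batch-of-one tensor preprocessed for `classifier`;
    pil_image: the same sample as a PIL image for CLIP."""
    with torch.no_grad():
        pred = classifier(x).argmax(dim=-1).item()
        prompts = [f"a photo of a {c}" for c in class_names]
        inputs = processor(text=prompts, images=pil_image,
                           return_tensors="pt", padding=True)
        clip_probs = clip(**inputs).logits_per_image.softmax(dim=-1)[0]
    # abstain if the textual and visual evidence disagree or alignment is weak
    if clip_probs.argmax().item() != pred or clip_probs[pred] < threshold:
        return None
    return pred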
On the Robustness of Adversarial Training Against Uncertainty Attacks
Ledda, Emanuele, Scodeller, Giovanni, Angioni, Daniele, Piras, Giorgio, Cinà, Antonio Emanuele, Fumera, Giorgio, Biggio, Battista, Roli, Fabio
In learning problems, the noise inherent to the task at hand makes it impossible to infer without some degree of uncertainty. Quantifying this uncertainty, beyond its already wide use, assumes high relevance for security-sensitive applications. Within these scenarios, it becomes fundamental to guarantee good (i.e., trustworthy) uncertainty measures, which downstream modules can securely employ to drive the final decision-making process. However, an attacker may be interested in forcing the system to produce either (i) highly uncertain outputs, jeopardizing the system's availability, or (ii) low uncertainty estimates, making the system accept uncertain samples that would instead require careful inspection (e.g., human intervention). It is therefore fundamental to understand how to obtain uncertainty estimates that are robust against these kinds of attacks. In this work, we reveal both empirically and theoretically that defending against adversarial examples, i.e., carefully perturbed samples that cause misclassification, additionally guarantees a more secure, trustworthy uncertainty estimate under common attack scenarios, without the need for an ad-hoc defense strategy. To support our claims, we evaluate multiple adversarial-robust models from the publicly available benchmark RobustBench on the CIFAR-10 and ImageNet datasets.
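As a minimal illustration of the threat model above, the sketch below uses softmax entropy as the uncertainty measure and takes a single FGSM-like step that either inflates it (an availability-style attack) or deflates it (forcing over-confidence); this is a simplified surrogate chosen for this example, not the attack used in the paper.

import torch
import torch.nn.functional as F

def predictive_entropy(logits):
    """Shannon entropy of the softmax output, a common uncertainty measure."""
    p = F.softmax(logits, dim=-1)
    return -(p * p.clamp_min(1e-12).log()).sum(dim=-1)

def uncertainty_attack_step(model, x, epsilon=0.01, increase=True):
    """Single FGSM-like step on the entropy: increase it to mimic an
    availability attack, decrease it to force over-confident outputs."""
    x_adv = x.clone().detach().requires_grad_(True)
    h = predictive_entropy(model(x_adv)).sum()
    grad = torch.autograd.grad(h, x_adv)[0]
    direction = grad.sign() if increase else -grad.sign()
    return (x_adv + epsilon * direction).clamp(0, 1).detach()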
HO-FMN: Hyperparameter Optimization for Fast Minimum-Norm Attacks
Mura, Raffaele, Floris, Giuseppe, Scionis, Luca, Piras, Giorgio, Pintor, Maura, Demontis, Ambra, Giacinto, Giorgio, Biggio, Battista, Roli, Fabio
Gradient-based attacks are a primary tool to evaluate the robustness of machine-learning models. However, many attacks tend to provide overly optimistic evaluations as they use fixed loss functions, optimizers, step-size schedulers, and default hyperparameters. In this work, we tackle these limitations by proposing a parametric variation of the well-known fast minimum-norm attack algorithm, whose loss, optimizer, step-size scheduler, and hyperparameters can be dynamically adjusted. We re-evaluate 12 robust models, showing that our attack finds smaller adversarial perturbations without requiring any additional tuning. This also enables reporting adversarial robustness as a function of the perturbation budget, providing a more complete evaluation than that offered by fixed-budget attacks, while remaining efficient. We release our open-source code at https://github.com/pralab/HO-FMN.
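The sketch below illustrates the idea of searching over attack configurations and keeping, per sample, the smallest perturbation found; the `run_attack` callable and the configuration names are placeholders, and the actual implementation is the one released at the repository linked above.

import itertools
import torch

def tune_attack(run_attack, model, x, y,
                losses=("ce", "dlr"),
                optimizers=("sgd", "adam"),
                schedulers=("cosine", "constant")):
    """Grid search over attack configurations; for each sample keep the
    smallest adversarial perturbation found by any configuration.
    `run_attack` stands in for a parametric minimum-norm attack and must
    return one perturbation (same shape as x) per sample."""
    best_norm = torch.full((x.shape[0],), float("inf"))
    best_delta = torch.zeros_like(x)
    for loss, opt, sched in itertools.product(losses, optimizers, schedulers):
        delta = run_attack(model, x, y, loss=loss, optimizer=opt,
                           scheduler=sched)
        norm = delta.flatten(1).norm(p=2, dim=1)
        improved = norm < best_norm
        best_norm[improved] = norm[improved]
        best_delta[improved] = delta[improved]
    return best_delta, best_norm

Because the attack is minimum-norm, robust accuracy at any budget epsilon can then be read off as the fraction of samples whose best perturbation norm exceeds epsilon.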
A Hybrid Training-time and Run-time Defense Against Adversarial Attacks in Modulation Classification
Zhang, Lu, Lambotharan, Sangarapillai, Zheng, Gan, Liao, Guisheng, Demontis, Ambra, Roli, Fabio
Motivated by the superior performance of deep learning in many applications including computer vision and natural language processing, several recent studies have focused on applying deep neural networks to the design of future generations of wireless networks. However, other recent works have pointed out that imperceptible and carefully designed adversarial examples (attacks) can significantly deteriorate classification accuracy. In this paper, we investigate a defense mechanism based on both training-time and run-time defense techniques for protecting machine learning-based radio signal (modulation) classification against adversarial attacks. The training-time defense consists of adversarial training and label smoothing, while the run-time defense employs a support vector machine-based neural rejection (NR). Considering a white-box scenario and real datasets, we demonstrate that our proposed techniques outperform existing state-of-the-art technologies.
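A simplified sketch of the run-time component, in the spirit of the neural rejection scheme described above: an RBF SVM is fit on the network's feature representation, and inputs whose maximum class score falls below a threshold are rejected. The class and method names are illustrative, and a multi-class problem is assumed.

import numpy as np
from sklearn.svm import SVC

class NeuralRejection:
    """Run-time defense: an RBF SVM trained on the network's features rejects
    inputs whose maximum class score is too low."""

    def __init__(self, feature_extractor, threshold=0.0):
        self.feature_extractor = feature_extractor  # maps inputs to features
        self.svm = SVC(kernel="rbf", decision_function_shape="ovr")
        self.threshold = threshold

    def fit(self, X, y):
        self.svm.fit(self.feature_extractor(X), y)
        return self

    def predict(self, X):
        scores = self.svm.decision_function(self.feature_extractor(X))
        preds = self.svm.classes_[np.argmax(scores, axis=1)]
        # -1 marks rejected (likely adversarial or anomalous) inputs
        preds[np.max(scores, axis=1) < self.threshold] = -1
        return preds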
Countermeasures Against Adversarial Examples in Radio Signal Classification
Zhang, Lu, Lambotharan, Sangarapillai, Zheng, Gan, AsSadhan, Basil, Roli, Fabio
Deep learning algorithms have been shown to be powerful in many communication network design problems, including automatic modulation classification. However, they are vulnerable to carefully crafted attacks called adversarial examples. Hence, the reliance of wireless networks on deep learning algorithms poses a serious threat to their security and operation. In this letter, we propose for the first time a countermeasure against adversarial examples in modulation classification. Our countermeasure is based on a neural rejection technique, augmented by label smoothing and Gaussian noise injection, which detects and rejects adversarial examples with high accuracy. Our results demonstrate that the proposed countermeasure can protect deep-learning-based modulation classification systems against adversarial examples.
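For illustration, the two training-time ingredients named above (label smoothing and Gaussian noise injection) can be combined in a single PyTorch training step as sketched below; the hyperparameter values are placeholders rather than the ones used in the letter.

import torch
import torch.nn.functional as F

def train_step(model, optimizer, x, y, noise_std=0.05, smoothing=0.1):
    """One training step with Gaussian noise injection on the inputs and
    label smoothing on the targets, used to harden the classifier before
    neural rejection is applied at run time."""
    model.train()
    optimizer.zero_grad()
    x_noisy = x + noise_std * torch.randn_like(x)
    logits = model(x_noisy)
    # cross-entropy against smoothed one-hot targets
    loss = F.cross_entropy(logits, y, label_smoothing=smoothing)
    loss.backward()
    optimizer.step()
    return loss.item()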
Over-parameterization and Adversarial Robustness in Neural Networks: An Overview and Empirical Analysis
Chen, Zhang, Demetrio, Luca, Gupta, Srishti, Feng, Xiaoyi, Xia, Zhaoqiang, Cinà, Antonio Emanuele, Pintor, Maura, Oneto, Luca, Demontis, Ambra, Biggio, Battista, Roli, Fabio
Having a large parameter space is considered one of the main suspects behind neural networks' vulnerability to adversarial examples -- input samples crafted ad hoc to induce a desired misclassification. Relevant literature has made contradictory claims both for and against the robustness of over-parameterized networks. These contradictory findings might be due to failures of the attacks employed to evaluate the networks' robustness. Previous research has demonstrated that, depending on the considered model, the algorithm employed to generate adversarial examples may not function properly, leading to an overestimation of the model's robustness. In this work, we empirically study the robustness of over-parameterized networks against adversarial examples. Unlike previous works, however, we also evaluate the reliability of the employed attack to support the veracity of the results. Our results show that over-parameterized networks are robust against adversarial attacks, as opposed to their under-parameterized counterparts.
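A schematic example of the evaluation methodology described above: robust accuracy is computed under a given attack together with two basic reliability checks (budget violations and "silent" successes on already-misclassified points); the attack interface and the tolerance are assumptions made for this sketch.

import torch

def robust_accuracy(model, attack, x, y, epsilon):
    """Accuracy under `attack` plus two simple reliability checks inspired by
    work on indicators of attack failure. `attack(model, x, y, epsilon)` is a
    stand-in returning adversarial inputs within an L-inf budget."""
    model.eval()
    with torch.no_grad():
        clean_pred = model(x).argmax(dim=1)
    x_adv = attack(model, x, y, epsilon)
    with torch.no_grad():
        adv_pred = model(x_adv).argmax(dim=1)

    robust = (adv_pred == y).float().mean().item()
    # check 1: perturbations must respect the L-inf budget
    budget_ok = ((x_adv - x).flatten(1).norm(float("inf"), dim=1)
                 <= epsilon + 1e-6).all().item()
    # check 2: points misclassified before the attack should not become
    # "correct" after it (a sign the attack silently failed)
    silent_failures = int(((clean_pred != y) & (adv_pred == y)).sum())
    return robust, budget_ok, silent_failures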
SLIFER: Investigating Performance and Robustness of Malware Detection Pipelines
Ponte, Andrea, Trizna, Dmitrijs, Demetrio, Luca, Biggio, Battista, Ogbu, Ivan Tesfai, Roli, Fabio
As a result of decades of research, Windows malware detection is approached through a plethora of techniques. However, there is an ongoing mismatch between academia -- which pursues optimal performance in terms of detection rate and low false alarms -- and the requirements of real-world scenarios. In particular, academia focuses on combining static and dynamic analysis within a single model or an ensemble of models, falling into several pitfalls like (i) firing dynamic analysis without considering the computational burden it requires; (ii) discarding impossible-to-analyse samples; and (iii) analysing robustness against adversarial attacks without considering that malware detectors are complemented with additional non-machine-learning components. Thus, in this paper we propose SLIFER, a novel Windows malware detection pipeline that sequentially leverages both static and dynamic analysis, interrupting computation as soon as one module triggers an alarm and requiring dynamic analysis only when needed. Contrary to the state of the art, we investigate how to deal with samples resistant to analysis, showing how much they impact performance and concluding that it is better to flag them as legitimate so as not to drastically increase false alarms. Lastly, we perform a robustness evaluation of SLIFER leveraging content-injection attacks, and we show that, counter-intuitively, attacks are blocked more often by YARA rules than by dynamic analysis, due to byte artifacts created while optimizing the adversarial strategy.
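The sequential, early-exit structure described above can be sketched as follows; the module names, interfaces, and handling of analysis failures are illustrative assumptions, not the SLIFER code.

from typing import Callable, List, Optional

class SequentialMalwarePipeline:
    """Sequential detection: modules are queried in order (e.g., signatures,
    static ML, dynamic ML) and computation stops at the first alarm, so the
    costly dynamic analysis runs only when cheaper modules do not fire."""

    def __init__(self, modules: List[Callable[[bytes], Optional[bool]]]):
        # each module returns True (malware), False (benign),
        # or None when it cannot analyse the sample
        self.modules = modules

    def predict(self, sample: bytes) -> bool:
        for module in self.modules:
            try:
                verdict = module(sample)
            except Exception:
                verdict = None  # analysis failed (e.g., packed or corrupted sample)
            if verdict is True:
                return True  # stop at the first alarm
        # no module raised an alarm; samples resistant to analysis also end up
        # here and are flagged as legitimate to keep false alarms low
        return False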