

Supplementary Material for Understanding and Improving Ensemble Adversarial Defense

Neural Information Processing Systems

They are used to test the proposed enhancement approach iGAT. In general, ADP employs an ensemble by averaging, i.e., $h(x) = \frac{1}{C}\sum_{c=1}^{C} h_c(x)$. Adversarial examples are generated to compute the losses by using the PGD attack. Our main theorem builds on a supporting Lemma 2.1. We start from the cross-entropy loss curvature measured by Eq. The above new expression of $T(x)$ helps bound the difference between $h(x)$ and $h(\tilde{x})$. Note that these three cases are mutually exclusive.
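The snippet names two concrete ingredients: prediction averaging across base classifiers and PGD-generated adversarial examples. Below is a minimal PyTorch sketch of both, assuming `models` is a list of trained base classifiers; the budget `eps`, step size `alpha`, and step count are illustrative choices, not values from the supplementary material.

```python
import torch
import torch.nn.functional as F

def ensemble_probs(models, x):
    # "Ensemble by averaging": mean of the base classifiers' softmax outputs.
    return torch.stack([F.softmax(m(x), dim=1) for m in models]).mean(dim=0)

def pgd_attack(models, x, y, eps=8/255, alpha=2/255, steps=10):
    # L-infinity PGD against the averaged ensemble; random start inside the ball.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.nll_loss(torch.log(ensemble_probs(models, x_adv) + 1e-12), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()  # ascend the loss
        # project back into the eps-ball around x and the valid pixel range
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()
```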






Due to the time constraints of the rebuttal, we limited

Neural Information Processing Systems

We cannot thank the reviewers enough for their valuable feedback on our work. Reviewers 1 and 2: Combine guess loss with additive noise. Most recent advances in adversarial defense methods address "black-box attacks" performed by a ... The latter incorporates adversarial examples during training to increase the model's robustness to the attack. Therefore, the reconstructed image can serve as an adversarially perturbed example of the non-adversarial input image. Reviewer 3: Novelty is not enough, as most of the proposed solutions or observations are already published.


Towards Stable and Robust AdderNets (Supplementary Material)

Dong, Minjing, Wang, Yunhe

Neural Information Processing Systems

As shown in Table 1, the stability of adversarial robustness is evaluated under different inference settings. We first show whether shuffling the test set influences the performance. In the main body, we mainly focus on the comparison with CNN, since we want to highlight the natural robustness of AdderNet compared to CNN under the same setting. We further evaluate the performance of AWN on CNNs. As shown in Table 3, with the involvement of AWN, CNN obtains slightly better adversarial robustness.
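A minimal sketch of the inference-stability check described above: robust accuracy is measured with the test set in fixed versus shuffled order, and a stable defense should yield matching numbers. Here `model`, `dataset`, and `attack` are placeholders, and the batch size is an illustrative choice.

```python
import torch
from torch.utils.data import DataLoader

def robust_accuracy(model, dataset, attack, shuffle, batch_size=128):
    # `attack(model, x, y)` is assumed to return adversarial examples,
    # e.g. a PGD routine like the sketch earlier in this digest.
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=shuffle)
    model.eval()
    correct = total = 0
    for x, y in loader:
        x_adv = attack(model, x, y)
        with torch.no_grad():
            correct += (model(x_adv).argmax(dim=1) == y).sum().item()
        total += y.numel()
    return correct / total

# Stability check: the two values should agree for a stable defense.
# acc_fixed    = robust_accuracy(net, testset, pgd, shuffle=False)
# acc_shuffled = robust_accuracy(net, testset, pgd, shuffle=True)
```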


A Comprehensive Study of Jailbreak Attack versus Defense for Large Language Models

Xu, Zihao, Liu, Yi, Deng, Gelei, Li, Yuekang, Picek, Stjepan

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have increasingly become central to generating content with potential societal impacts. Notably, these models have demonstrated capabilities for generating content that could be deemed harmful. To mitigate these risks, researchers have adopted safety training techniques to align model outputs with societal values and curb the generation of malicious content. However, the phenomenon of "jailbreaking", where carefully crafted prompts elicit harmful responses from models, persists as a significant challenge. This research conducts a comprehensive analysis of existing studies on jailbreaking LLMs and their defense techniques. We meticulously investigate nine attack techniques and seven defense techniques applied across three distinct language models: Vicuna, Llama, and GPT-3.5 Turbo, aiming to evaluate the effectiveness of these attack and defense techniques. Our findings reveal that existing white-box attacks underperform compared to universal techniques and that including special tokens in the input significantly affects the likelihood of successful attacks. This research highlights the need to concentrate on the security facets of LLMs. Additionally, we contribute to the field by releasing our datasets and testing framework, aiming to foster further research into LLM security. We believe these contributions will facilitate the exploration of security measures within this domain.
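One finding above is that special tokens in the input shift attack success rates. A hypothetical harness sketch of how such a comparison could be run is shown below; `query_model`, the refusal markers, and the `[INST]` wrapper are illustrative placeholders, not part of the paper's released framework.

```python
# Crude keyword heuristic: treat any non-refusal response as a "success".
REFUSAL_MARKERS = ("I'm sorry", "I cannot", "I can't assist")

def attack_succeeds(response: str) -> bool:
    return not any(marker in response for marker in REFUSAL_MARKERS)

def success_rate(query_model, prompts, special_token=None):
    # Fraction of jailbreak prompts that elicit a non-refusal response,
    # optionally wrapping each prompt in a special token such as "[INST]".
    hits = 0
    for p in prompts:
        if special_token is not None:
            p = f"{special_token} {p} {special_token}"
        hits += attack_succeeds(query_model(p))
    return hits / len(prompts)

# Compare, e.g.:
# success_rate(ask_vicuna, jailbreak_prompts)
# success_rate(ask_vicuna, jailbreak_prompts, special_token="[INST]")
```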


Understanding and Improving Ensemble Adversarial Defense

Deng, Yian, Mu, Tingting

arXiv.org Artificial Intelligence

Ensemble strategies have become popular in adversarial defense, training multiple base classifiers to defend against adversarial attacks in a cooperative manner. Despite the empirical success, theoretical explanations of why an ensemble of adversarially trained classifiers is more robust than a single one remain unclear. To fill this gap, we develop a new error theory dedicated to understanding ensemble adversarial defense, demonstrating a provable 0-1 loss reduction on challenging sample sets in an adversarial defense scenario. Guided by this theory, we propose an effective approach to improve ensemble adversarial defense, named interactive global adversarial training (iGAT). The proposal includes (1) a probabilistic distributing rule that selectively allocates adversarial examples that are globally challenging to the ensemble among the different base classifiers, and (2) a regularization term to rescue the severest weaknesses of the base classifiers. Tested with various existing ensemble adversarial defense techniques, iGAT boosts their performance by up to 17% on the CIFAR10 and CIFAR100 datasets under both white-box and black-box attacks.
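A minimal sketch of one plausible instantiation of such a probabilistic distributing rule: each globally adversarial example is routed to a single base classifier, sampled with probability proportional to that classifier's softmax score on the true class. The proportional-to-true-class-probability routing is an illustrative assumption, not necessarily the paper's exact rule.

```python
import torch
import torch.nn.functional as F

def distributed_loss(models, x_adv, y):
    # Route each adversarial example to one base classifier at random,
    # weighted by how confidently each classifier handles it.
    with torch.no_grad():
        # scores[c, i]: classifier c's probability for example i's true label
        scores = torch.stack([
            F.softmax(m(x_adv), dim=1).gather(1, y.unsqueeze(1)).squeeze(1)
            for m in models
        ])                                              # shape: (C, B)
    probs = scores / scores.sum(dim=0, keepdim=True)    # normalize per example
    assign = torch.multinomial(probs.t(), num_samples=1).squeeze(1)  # (B,)
    # Each base classifier trains only on the examples allocated to it.
    losses = []
    for c, m in enumerate(models):
        idx = (assign == c).nonzero(as_tuple=True)[0]
        if idx.numel() > 0:
            losses.append(F.cross_entropy(m(x_adv[idx]), y[idx]))
    return torch.stack(losses).sum()
```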


Adversarial attacks and defenses on ML- and hardware-based IoT device fingerprinting and identification

Sánchez, Pedro Miguel Sánchez, Celdrán, Alberto Huertas, Bovet, Gérôme, Pérez, Gregorio Martínez

arXiv.org Artificial Intelligence

In recent years, the number of deployed IoT devices has exploded, reaching the scale of billions. However, this growth has been accompanied by new cybersecurity issues, such as the deployment of unauthorized devices, malicious code modification, malware deployment, and vulnerability exploitation. This has motivated the need for new device identification mechanisms based on behavior monitoring. These solutions have recently leveraged Machine and Deep Learning techniques, owing to advances in the field and increased processing capabilities. In turn, attackers have not stood still and have developed adversarial attacks, focused on context modification and ML/DL evasion, against IoT device identification solutions. This work explores the performance of hardware behavior-based individual device identification, how it is affected by possible context- and ML/DL-focused attacks, and how its resilience can be improved using defense techniques. To this end, it proposes an LSTM-CNN architecture based on hardware performance behavior for individual device identification. Previous techniques are then compared with the proposed architecture using a hardware performance dataset collected from 45 Raspberry Pi devices running identical software. The LSTM-CNN improves on previous solutions, achieving an average F1-score of 0.96 and a minimum TPR of 0.8 across all devices. Afterward, context- and ML/DL-focused adversarial attacks were applied against the model to test its robustness. A temperature-based context attack was unable to disrupt identification, but some state-of-the-art ML/DL evasion attacks were successful. Finally, adversarial training and model distillation defense techniques were selected to improve the model's resilience to evasion attacks without degrading its performance.
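A minimal PyTorch sketch of an LSTM-CNN classifier in the spirit described above, assuming the input is a time series of hardware-performance features: an LSTM summarizes the temporal behavior, 1-D convolutions extract local patterns, and a linear head classifies among the 45 devices. The layer sizes and feature count are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class LSTMCNN(nn.Module):
    def __init__(self, n_features=10, n_devices=45, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.conv = nn.Sequential(
            nn.Conv1d(hidden, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveMaxPool1d(1),   # pool over the time dimension
        )
        self.head = nn.Linear(32, n_devices)

    def forward(self, x):              # x: (batch, time, n_features)
        seq, _ = self.lstm(x)          # (batch, time, hidden)
        z = self.conv(seq.transpose(1, 2)).squeeze(-1)  # (batch, 32)
        return self.head(z)            # device logits

# model = LSTMCNN()
# logits = model(torch.randn(8, 100, 10))  # 8 windows of 100 time steps
```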