
Robustness evaluation




Accurate, reliable and fast robustness evaluation

Neural Information Processing Systems

Throughout the past five years, the susceptibility of neural networks to minimal adversarial perturbations has moved from a peculiar phenomenon to a core issue in Deep Learning. Despite much attention, however, progress towards more robust models is significantly impaired by the difficulty of evaluating the robustness of neural network models. Today's methods are either fast but brittle (gradient-based attacks), or they are fairly reliable but slow (score- and decision-based attacks). We here develop a new set of gradient-based adversarial attacks which (a) are more reliable in the face of gradient masking than other gradient-based attacks, (b) perform better and are more query-efficient than current state-of-the-art gradient-based attacks, (c) can be flexibly adapted to a wide range of adversarial criteria and (d) require virtually no hyperparameter tuning. These findings are carefully validated across a diverse set of six different models and hold for L0, L1, L2 and Linf in both targeted and untargeted scenarios. Implementations will soon be available in all major toolboxes (Foolbox, CleverHans and ART). We hope that this class of attacks will make robustness evaluations easier and more reliable, thus contributing to more signal in the search for more robust machine learning models.
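As a rough illustration of the kind of gradient-based attack such evaluations rely on, the sketch below implements a standard Linf projected gradient descent (PGD) baseline in PyTorch. It is not the set of attacks proposed in the paper; the model interface, step size and perturbation budget are placeholder assumptions.

    # Minimal L-infinity PGD sketch (generic baseline, not the paper's attacks).
    import torch
    import torch.nn.functional as F

    def pgd_linf(model, x, y, eps=8/255, alpha=2/255, steps=10):
        """Return adversarial examples within an L-inf ball of radius eps around x."""
        # Random start inside the eps-ball (common PGD variant).
        x_adv = x.clone().detach() + torch.empty_like(x).uniform_(-eps, eps)
        x_adv = x_adv.clamp(0, 1)
        for _ in range(steps):
            x_adv.requires_grad_(True)
            loss = F.cross_entropy(model(x_adv), y)
            grad = torch.autograd.grad(loss, x_adv)[0]
            with torch.no_grad():
                x_adv = x_adv + alpha * grad.sign()                # ascend the loss
                x_adv = x + (x_adv - x).clamp(-eps, eps)           # project back into the eps-ball
                x_adv = x_adv.clamp(0, 1)                          # keep a valid pixel range
        return x_adv.detach()

    # Usage (hypothetical model and data):
    # adv = pgd_linf(model, images, labels)
    # robust_acc = (model(adv).argmax(1) == labels).float().mean()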


GREAT Score: Global Robustness Evaluation of Adversarial Perturbation using Generative Models

Neural Information Processing Systems

Current studies on adversarial robustness mainly focus on aggregating local robustness results from a set of data samples to evaluate and rank different models. However, these local statistics may not well represent the true global robustness of the underlying unknown data distribution. To address this challenge, this paper makes the first attempt to present a new framework, called GREAT Score, for global robustness evaluation of adversarial perturbation using generative models. Formally, GREAT Score carries the physical meaning of a global statistic capturing the mean certified attack-proof perturbation level over all samples drawn from a generative model. For finite-sample evaluation, we also derive a probabilistic guarantee on the sample complexity and the difference between the sample mean and the true mean. GREAT Score has several advantages: (1) robustness evaluations using GREAT Score are efficient and scalable to large models, since they spare the need to run adversarial attacks. In particular, we show a high correlation with, and significantly reduced computation cost compared to, the attack-based model ranking on RobustBench (Croce et al., 2021).
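The abstract does not state the estimator itself, so the following is only a schematic sketch of the general idea: draw samples from a class-conditional generative model, compute a per-sample certified-radius proxy, and report the sample mean. The conditional generator, the classifier and the margin-based radius proxy are all illustrative assumptions, not the actual GREAT Score definition.

    # Schematic sketch of a generative-model-based global robustness statistic.
    # The margin-based radius proxy is an illustrative placeholder, not the GREAT Score formula.
    import torch

    def certified_radius_proxy(logits, labels, lipschitz_const=1.0):
        """Margin between top and runner-up logit, scaled by an assumed Lipschitz constant."""
        top2 = logits.topk(2, dim=1).values
        margin = (top2[:, 0] - top2[:, 1]).clamp(min=0.0)
        correct = logits.argmax(1) == labels
        # Misclassified samples contribute a radius of zero.
        return torch.where(correct, margin / (2.0 ** 0.5 * lipschitz_const), torch.zeros_like(margin))

    def global_robustness_score(cond_generator, classifier, num_classes,
                                n_samples=1000, batch=100, latent_dim=128):
        """Mean per-sample robustness proxy over samples drawn from a class-conditional generator."""
        radii = []
        with torch.no_grad():
            for _ in range(n_samples // batch):
                z = torch.randn(batch, latent_dim)
                y = torch.randint(0, num_classes, (batch,))
                x = cond_generator(z, y)          # synthetic inputs with known labels
                radii.append(certified_radius_proxy(classifier(x), y))
        return torch.cat(radii).mean().item()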


A Sensor-Aware Phenomenological Framework for Lidar Degradation Simulation and SLAM Robustness Evaluation

Felix, Doumegna Mawuto Koudjo, Yu, Xianjia, Zou, Zhuo, Westerlund, Tomi

arXiv.org Artificial Intelligence

Lidar-based SLAM systems are highly sensitive to adverse conditions such as occlusion, noise, and field-of-view (FoV) degradation, yet existing robustness evaluation methods either lack physical grounding or do not capture sensor-specific behavior. This paper presents a sensor-aware, phenomenological framework for simulating interpretable lidar degradations directly on real point clouds, enabling controlled and reproducible SLAM stress testing. Unlike image-derived corruption benchmarks (e.g., SemanticKITTI-C) or simulation-only approaches (e.g., lidarsim), the proposed system preserves per-point geometry, intensity, and temporal structure while applying structured dropout, FoV reduction, Gaussian noise, occlusion masking, sparsification, and motion distortion. Experimental validation across three lidar architectures and five state-of-the-art SLAM systems reveals distinct robustness patterns shaped by sensor design and environmental context. The open-source implementation provides a practical foundation for benchmarking lidar-based SLAM under physically meaningful degradation scenarios.
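A minimal sketch of what such phenomenological degradations could look like when applied to a raw point cloud follows; it is not the authors' open-source implementation, and the (N, 4) x/y/z/intensity layout and parameter values are assumptions.

    # Illustrative lidar point-cloud degradations (sketch, not the paper's implementation).
    # Assumes an (N, 4) array of x, y, z, intensity.
    import numpy as np

    def reduce_fov(points, fov_deg=180.0):
        """Keep only points whose horizontal angle lies within the given field of view."""
        az = np.degrees(np.arctan2(points[:, 1], points[:, 0]))
        return points[np.abs(az) <= fov_deg / 2.0]

    def add_range_noise(points, sigma=0.02, rng=None):
        """Add Gaussian noise to the xyz coordinates, leaving intensity untouched."""
        rng = np.random.default_rng() if rng is None else rng
        noisy = points.copy()
        noisy[:, :3] += rng.normal(0.0, sigma, size=noisy[:, :3].shape)
        return noisy

    def sparsify(points, keep_ratio=0.5, rng=None):
        """Random dropout: keep a fraction of points uniformly at random."""
        rng = np.random.default_rng() if rng is None else rng
        return points[rng.random(len(points)) < keep_ratio]

    # Example pipeline on a dummy scan:
    scan = np.random.rand(10000, 4).astype(np.float32)
    degraded = sparsify(add_range_noise(reduce_fov(scan, fov_deg=120.0)), keep_ratio=0.6)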


RO-Bench: Large-scale robustness evaluation of MLLMs with text-driven counterfactual videos

Yang, Zixi, Li, Jiapeng, Diao, Muxi, Jing, Yinuo, Liang, Kongming

arXiv.org Artificial Intelligence

Recently, Multi-modal Large Language Models (MLLMs) have demonstrated significant performance across various video understanding tasks. However, their robustness, particularly when faced with manipulated video content, remains largely unexplored. In this paper, we introduce Ro-Bench, the first benchmark for evaluating MLLMs on dynamic out-of-distribution (OOD) counterfactual video test sets. Ro-Bench incorporates high-quality, diverse and temporally relevant video data by editing Style, Object, Background and their compositions. We evaluated eight recent video MLLMs and found that current models exhibit substantial performance degradation on Ro-Bench when exposed to counterfactual video content. Furthermore, we demonstrate that fine-tuning MLLMs with counterfactual data enhances robustness, achieving a 21.73% performance increase on Ro-Bench and a 12.78% improvement across 20 tasks in the MVBench dataset. These findings underscore the effectiveness of counterfactual data in enhancing the video understanding ability of MLLMs. The code and data will be released shortly.
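One simple way to quantify the robustness drop on such counterfactual data is to compare accuracy on paired original and edited videos. The sketch below assumes a hypothetical predict(video, question) interface and a paired data layout; it is not the Ro-Bench evaluation code.

    # Sketch of measuring the accuracy drop between original and counterfactual videos
    # (hypothetical `predict` interface and data pairing, not the released benchmark code).
    from typing import Callable, List, Tuple

    def robustness_drop(predict: Callable[[str, str], str],
                        pairs: List[Tuple[str, str, str, str]]) -> Tuple[float, float]:
        """pairs: (original_video, counterfactual_video, question, answer). Returns (clean_acc, cf_acc)."""
        clean_hits = cf_hits = 0
        for orig, counterfactual, question, answer in pairs:
            clean_hits += int(predict(orig, question).strip().lower() == answer.lower())
            cf_hits += int(predict(counterfactual, question).strip().lower() == answer.lower())
        n = max(len(pairs), 1)
        return clean_hits / n, cf_hits / n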



A Proof for Theorem 1 and 2

Neural Information Processing Systems

The proof assumes Gaussian-distributed weights, inspired by [67-69], and extends both to convolutional neural networks (CNNs) and to residual blocks (as used in ResNet). Figure 6: kernel density estimation plot of the weight matrix for an adversarially trained WRN-34-10; the learning rate is divided by 10 at the 75th and 90th epochs, and within the same stage the same type of residual block with 2 convolution operations is used.
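As a minimal sketch of how weight Gaussianity can be inspected with kernel density estimation, along the lines of the Figure 6 caption, the snippet below uses a synthetic stand-in weight matrix rather than the actual WRN-34-10 weights.

    # Sketch: compare the kernel density estimate of a weight matrix against a fitted Gaussian.
    # The weight matrix is a synthetic stand-in, not trained WRN-34-10 weights.
    import numpy as np
    from scipy.stats import gaussian_kde

    weights = np.random.randn(640, 640) * 0.05      # placeholder for a trained layer's weights
    flat = weights.ravel()

    kde = gaussian_kde(flat)
    grid = np.linspace(flat.min(), flat.max(), 200)
    density = kde(grid)

    # Gaussian fitted to the empirical mean and standard deviation.
    mu, sigma = flat.mean(), flat.std()
    gaussian = np.exp(-0.5 * ((grid - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    print("max |KDE - Gaussian fit|:", np.abs(density - gaussian).max())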



RDI: An adversarial robustness evaluation metric for deep neural networks based on model statistical features

Song, Jialei, Zuo, Xingquan, Wang, Feiyang, Huang, Hai, Zhang, Tianle

arXiv.org Artificial Intelligence

Deep neural networks (DNNs) are highly susceptible to adversarial samples, raising concerns about their reliability in safety-critical tasks. Currently, methods of evaluating adversarial robustness are primarily categorized into attack-based and certified robustness evaluation approaches. The former not only relies on specific attack algorithms but is also highly time-consuming, while the latter, due to its analytical nature, is typically difficult to implement for large and complex models. A few studies evaluate model robustness based on the model's decision boundary, but they suffer from low evaluation accuracy. To address these issues, we propose a novel adversarial robustness evaluation metric, the Robustness Difference Index (RDI), which is based on model statistical features. RDI draws inspiration from clustering evaluation by analyzing the intra-class and inter-class distances of feature vectors separated by the decision boundary to quantify model robustness. It is attack-independent and has high computational efficiency. Experiments show that RDI demonstrates a stronger correlation with the gold-standard adversarial robustness metric of attack success rate (ASR). The average computation time of RDI is only 1/30 that of the PGD-attack-based evaluation method. Our open-source code is available at: https://github.com/BUPTAIOC/RDI.
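The abstract does not give the RDI formula, so the sketch below only illustrates the underlying idea of a clustering-style index built from intra-class and inter-class distances of penultimate-layer features; the exact normalization is an assumption, not the published metric.

    # Illustrative clustering-style robustness index from feature vectors
    # (in the spirit of RDI's intra-/inter-class distance analysis, not the published formula).
    import numpy as np

    def distance_index(features: np.ndarray, labels: np.ndarray) -> float:
        """Larger values mean classes are further apart relative to their spread."""
        classes = np.unique(labels)
        centroids = np.stack([features[labels == c].mean(axis=0) for c in classes])
        # Mean intra-class distance: each sample to its own class centroid.
        intra = np.mean([
            np.linalg.norm(features[labels == c] - centroids[i], axis=1).mean()
            for i, c in enumerate(classes)
        ])
        # Mean inter-class distance: centroid-to-centroid over distinct class pairs.
        diffs = centroids[:, None, :] - centroids[None, :, :]
        pair_dists = np.linalg.norm(diffs, axis=-1)
        inter = pair_dists[np.triu_indices(len(classes), k=1)].mean()
        return (inter - intra) / max(inter, 1e-12)

    # Usage (hypothetical features extracted from a DNN's penultimate layer):
    # score = distance_index(feature_matrix, label_vector)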