perturbation direction
Boosting Adversarial Transferability via Residual Perturbation Attack
Peng, Jinjia, Tao, Zeze, Wang, Huibing, Wang, Meng, Wang, Yang
Deep neural networks are susceptible to adversarial examples: imperceptible perturbations that cause incorrect predictions. Transfer-based attacks craft adversarial examples on a surrogate model and transfer them to target models in black-box scenarios. Recent studies reveal that adversarial examples lying in flat regions of the loss landscape exhibit superior transferability, since flatness alleviates over-fitting to the surrogate model. However, prior works overlook the influence of perturbation directions, which limits transferability. In this paper, we propose a novel attack method, named Residual Perturbation Attack (ResPA), which relies on the residual gradient as the perturbation direction to guide adversarial examples toward flat regions of the loss function. Specifically, ResPA applies an exponential moving average to the input gradients to obtain the first moment as a reference gradient, which encodes the direction of historical gradients. Instead of relying solely on the local flatness implied by the current gradient as the perturbation direction, ResPA further considers the residual between the current gradient and the reference gradient to capture changes in the global perturbation direction. Experimental results demonstrate that ResPA transfers better than existing typical transfer-based attack methods, and its transferability can be further improved by combining ResPA with current input transformation methods. The code is available at https://github.com/ZezeTao/ResPA.
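The abstract names the two ingredients (an EMA reference gradient and the current-minus-reference residual) but not the full update rule, so the following is only a minimal sketch of the idea in PyTorch. The I-FGSM-style sign update, step size `alpha`, budget `eps`, and decay `mu` are assumptions for illustration, not details taken from the paper.

```python
import torch

def respa_style_attack(model, loss_fn, x, y, eps=8 / 255, alpha=2 / 255, steps=10, mu=0.9):
    """Hedged sketch: iterative attack whose step direction is the residual between
    the current input gradient and its exponential moving average (reference gradient)."""
    x_adv = x.clone().detach()
    ref = torch.zeros_like(x)                        # reference gradient (first moment)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        grad = torch.autograd.grad(loss_fn(model(x_adv), y), x_adv)[0]
        ref = mu * ref + (1.0 - mu) * grad           # EMA of input gradients
        residual = grad - ref                        # change relative to the historical direction
        x_adv = x_adv.detach() + alpha * residual.sign()
        x_adv = x + torch.clamp(x_adv - x, -eps, eps)   # stay inside the eps-ball around x
        x_adv = torch.clamp(x_adv, 0.0, 1.0)            # keep a valid image
    return x_adv.detach()
```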
Investigating Sensitive Directions in GPT-2: An Improved Baseline and Comparative Analysis of SAEs
Lee, Daniel J., Heimersheim, Stefan
Sensitive-direction experiments attempt to understand the computational features of Language Models (LMs) by measuring how much the next-token prediction probabilities change when activations are perturbed along specific directions. We extend this line of work by introducing an improved baseline for perturbation directions. We demonstrate that the KL divergence for Sparse Autoencoder (SAE) reconstruction errors is no longer pathologically high compared to the improved baseline. We also show that feature directions uncovered by SAEs have varying impacts on model outputs depending on the SAE's sparsity, with lower-L0 SAE feature directions exerting a greater influence. Additionally, we find that end-to-end SAE features do not exhibit stronger effects on model outputs than those of traditional SAEs.
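A minimal sketch of the underlying measurement, assuming a Hugging Face GPT-2-style causal LM whose block outputs can be modified with a forward hook; the hook point (e.g. `model.transformer.h[6]`), the perturbation scale, and perturbing every token position are illustrative assumptions rather than the paper's exact setup.

```python
import torch
import torch.nn.functional as F

def kl_after_perturbation(model, input_ids, layer, direction, scale=1.0):
    """Hedged sketch: KL(original || perturbed) of the next-token distribution after
    nudging one layer's output activations along `direction` (shape: hidden size)."""
    direction = direction / direction.norm()

    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden + scale * direction          # perturb the activations at every position
        return (hidden,) + output[1:] if isinstance(output, tuple) else hidden

    with torch.no_grad():
        base_logits = model(input_ids).logits[:, -1]     # unperturbed next-token logits
        handle = layer.register_forward_hook(hook)
        try:
            pert_logits = model(input_ids).logits[:, -1]
        finally:
            handle.remove()

    return F.kl_div(F.log_softmax(pert_logits, dim=-1),
                    F.log_softmax(base_logits, dim=-1),
                    log_target=True, reduction="batchmean")
```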
Reviews: On Blackbox Backpropagation and Jacobian Sensing
The paper focuses on automatic differentiation of multi-variable, vector-valued functions in the context of training model parameters from input data. A standard approach in a number of popular environments relies on backpropagating the estimation error, computed from the (local) gradient of the loss function, to update the model parameters. In this paper, the emphasis is on settings where some of the operators of the model are exogenous blackboxes for which the gradient cannot be computed explicitly, so one resorts to finite differencing of the function of interest. Such an approach can be prohibitively expensive unless the Jacobian has some special structure that can be exploited. The strategy pursued in this paper consists in exploiting the relationship between graph colouring and Jacobian estimation.
On Blackbox Backpropagation and Jacobian Sensing
Krzysztof M. Choromanski, Vikas Sindhwani
From a small number of calls to a given "blackbox" on random input perturbations, we show how to efficiently recover its unknown Jacobian, or estimate the left action of its Jacobian on a given vector. Our methods are based on a novel combination of compressed sensing and graph coloring techniques, and provably exploit structural prior knowledge about the Jacobian such as sparsity and symmetry while being noise robust. We demonstrate efficient backpropagation through noisy blackbox layers in a deep neural net, improved data-efficiency in the task of linearizing the dynamics of a rigid body system, and the generic ability to handle a rich class of input-output dependency structures in Jacobian estimation problems.
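The paper's method additionally uses compressed sensing and is noise robust; the sketch below shows only the simpler colouring intuition behind it, assuming the Jacobian's sparsity pattern is known in advance: columns whose nonzero rows never overlap can share a single finite-difference probe, so the number of blackbox calls drops from the input dimension to the number of colour groups. The greedy grouping is an illustration, not the paper's algorithm.

```python
import numpy as np

def greedy_column_groups(sparsity):
    """Group columns whose nonzero row supports are disjoint (greedy graph colouring)."""
    groups = []
    for j in range(sparsity.shape[1]):
        for g in groups:
            if not any(np.any(sparsity[:, j] & sparsity[:, k]) for k in g):
                g.append(j)
                break
        else:
            groups.append([j])
    return groups

def jacobian_by_coloring(f, x, sparsity, h=1e-6):
    """Estimate a sparse Jacobian with one forward finite difference per column group."""
    f0 = f(x)
    J = np.zeros(sparsity.shape)
    for g in greedy_column_groups(sparsity):
        d = np.zeros_like(x)
        d[g] = h                                   # probe all columns of the group at once
        df = (f(x + d) - f0) / h
        for j in g:
            rows = sparsity[:, j]
            J[rows, j] = df[rows]                  # disjoint supports let us unmix the response
    return J
```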
ADBA: Approximation Decision Boundary Approach for Black-Box Adversarial Attacks
Wang, Feiyang, Zuo, Xingquan, Huang, Hai, Chen, Gang
Many machine learning models are susceptible to adversarial attacks, with decision-based black-box attacks representing the most critical threat in real-world applications. These attacks are extremely stealthy, generating adversarial examples using hard labels obtained from the target machine learning model. This is typically realized by optimizing perturbation directions, guided by decision boundaries identified through query-intensive exact search, which significantly limits the attack success rate. This paper introduces a novel approach that uses an Approximation Decision Boundary (ADB) to efficiently and accurately compare perturbation directions without precisely determining decision boundaries. The effectiveness of our ADB approach (ADBA) hinges on promptly identifying a suitable ADB that ensures reliable differentiation of all perturbation directions. For this purpose, we analyze the probability distribution of decision boundaries, confirming that using the distribution's median value as the ADB can effectively distinguish different perturbation directions, giving rise to the ADBA-md algorithm. ADBA-md requires only four queries on average to differentiate any pair of perturbation directions, making it highly query-efficient. Extensive experiments on six well-known image classifiers clearly demonstrate the superiority of ADBA and ADBA-md over multiple state-of-the-art black-box attacks. The source code is available at https://github.com/BUPTAIOC/ADBA.
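A minimal sketch of the comparison primitive the abstract describes, assuming a hard-label classifier `predict` and a single trial radius playing the role of the approximate decision boundary: the direction that already flips the label at that radius is judged better, so no exact boundary search is needed. The concrete query schedule and the median-based radius estimation of ADBA-md are simplified away here.

```python
import numpy as np

def crosses_boundary(predict, x, y_true, direction, radius):
    """One hard-label query: does x + radius * unit(direction) change the predicted class?"""
    d = direction / np.linalg.norm(direction)
    return predict(x + radius * d) != y_true

def better_direction(predict, x, y_true, dir_a, dir_b, adb_radius):
    """Hedged sketch of an ADB-style comparison: prefer the direction that already flips
    the label at the shared approximate-boundary radius; a tie means the radius was
    uninformative and would be adjusted in the full method."""
    a_flips = crosses_boundary(predict, x, y_true, dir_a, adb_radius)
    b_flips = crosses_boundary(predict, x, y_true, dir_b, adb_radius)
    if a_flips and not b_flips:
        return dir_a
    if b_flips and not a_flips:
        return dir_b
    return None  # indistinguishable at this radius
```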
Visualizing, Rethinking, and Mining the Loss Landscape of Deep Neural Networks
Li, Xin-Chun, Li, Lan, Zhan, De-Chuan
The loss landscape of deep neural networks (DNNs) is commonly considered complex and wildly fluctuating. However, an interesting observation is that the loss curves plotted along Gaussian noise directions are almost always v-basin shaped, with the perturbed model lying in the basin. This motivates us to rethink whether 1D or 2D subspaces can capture more complex local geometric structures, and how to mine the corresponding perturbation directions. This paper systematically categorizes 1D curves from simple to complex, including v-basin, v-side, w-basin, w-peak, and vvv-basin curves. Notably, the latter two types are hard to obtain via intuitive construction of specific perturbation directions, and we propose suitable mining algorithms to plot the corresponding 1D curves. Combining these 1D directions, we visualize various types of 2D surfaces, such as saddle surfaces and wine-bottle-bottom surfaces, which previous works showed only with demo functions. Finally, we offer theoretical insights through the lens of the Hessian matrix to explain the several interesting phenomena observed.
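A minimal sketch of the basic 1D visualization this line of work builds on, assuming a PyTorch model and a fixed batch: evaluate the loss at theta + alpha * d over a range of alpha along a chosen direction d. The per-parameter Gaussian direction below is just one common choice, not the paper's mining procedure.

```python
import torch

def gaussian_direction(model):
    """One simple choice of perturbation direction: i.i.d. Gaussian noise per parameter."""
    return [torch.randn_like(p) for p in model.parameters()]

def loss_curve_1d(model, loss_fn, data, target, direction, alphas):
    """Hedged sketch: evaluate loss(theta + alpha * d) along one perturbation direction d,
    given as a list of tensors shaped like the model's parameters."""
    originals = [p.detach().clone() for p in model.parameters()]
    losses = []
    with torch.no_grad():
        for a in alphas:
            for p, p0, d in zip(model.parameters(), originals, direction):
                p.copy_(p0 + a * d)                        # move along the direction
            losses.append(loss_fn(model(data), target).item())
        for p, p0 in zip(model.parameters(), originals):   # restore the original weights
            p.copy_(p0)
    return losses
```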
MISA: Unveiling the Vulnerabilities in Split Federated Learning
Wan, Wei, Ning, Yuxuan, Hu, Shengshan, Xue, Lulu, Li, Minghui, Zhang, Leo Yu, Jin, Hai
Federated learning (FL) and split learning (SL) have been prevailing distributed paradigms in recent years. Both enable shared global model training while keeping data localized on users' devices. The former excels in parallel execution capabilities, while the latter enjoys low dependence on edge computing resources and strong privacy protection. Split federated learning (SFL) combines the strengths of both FL and SL, making it one of the most popular distributed architectures. Furthermore, a recent study has claimed that SFL exhibits robustness against poisoning attacks, with a fivefold improvement over FL in terms of robustness. In this paper, we present a novel poisoning attack known as MISA. It poisons both the top and bottom models, causing a misalignment in the global model (hence the name) and ultimately a drastic accuracy collapse. This attack unveils the vulnerabilities in SFL, challenging the conventional belief that SFL is robust against poisoning attacks. Extensive experiments demonstrate that our proposed MISA poses a significant threat to the availability of SFL, underscoring the imperative for academia and industry to accord this matter due attention.
Combining Adversaries with Anti-adversaries in Training
Zhou, Xiaoling, Yang, Nan, Wu, Ou
Adversarial training is an effective learning technique for improving the robustness of deep neural networks. In this study, the influence of adversarial training on deep learning models in terms of fairness, robustness, and generalization is theoretically investigated under a more general perturbation scope, in which different samples can have different perturbation directions (adversarial or anti-adversarial) and varied perturbation bounds. Our theoretical explorations suggest that, compared with standard adversarial training, combining adversaries and anti-adversaries (samples with anti-adversarial perturbations) in training can achieve better fairness between classes and a better tradeoff between robustness and generalization in some typical learning scenarios (e.g., noisy-label learning and imbalanced learning). On the basis of our theoretical findings, a more general learning objective that combines adversaries and anti-adversaries with varied bounds for each training sample is presented. Meta-learning is used to optimize the combination weights. Experiments on benchmark datasets under different learning scenarios verify our theoretical findings and the effectiveness of the proposed methodology.
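A minimal sketch of the objective's flavor, assuming single-step FGSM-style perturbations: the per-sample combination weights are shown as one fixed scalar `w_adv`, whereas the paper learns them via meta-learning, and the per-sample perturbation bounds are collapsed to a single `eps` for brevity.

```python
import torch

def combined_adv_loss(model, loss_fn, x, y, eps=4 / 255, w_adv=0.5):
    """Hedged sketch: mix losses on adversarial (x + eps*sign(g)) and
    anti-adversarial (x - eps*sign(g)) versions of each batch."""
    x = x.clone().detach().requires_grad_(True)
    clean_loss = loss_fn(model(x), y)
    grad_sign = torch.autograd.grad(clean_loss, x)[0].sign()

    x_adv = (x.detach() + eps * grad_sign).clamp(0, 1)    # adversarial direction
    x_anti = (x.detach() - eps * grad_sign).clamp(0, 1)   # anti-adversarial direction

    return w_adv * loss_fn(model(x_adv), y) + (1 - w_adv) * loss_fn(model(x_anti), y)
```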
Should Adversarial Attacks Use Pixel p-Norm?
Sen, Ayon, Zhu, Xiaojin, Marshall, Liam, Nowak, Robert
Adversarial attacks aim to confound machine learning systems, while remaining virtually imperceptible to humans. Attacks on image classification systems are typically gauged in terms of $p$-norm distortions in the pixel feature space. We perform a behavioral study, demonstrating that the pixel $p$-norm for any $0\le p \le \infty$, and several alternative measures including earth mover's distance, structural similarity index, and deep net embedding, do not fit human perception. Our result has the potential to improve the understanding of adversarial attack and defense strategies.
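For reference, the pixel-space distortion measures under discussion are straightforward to compute; a small sketch, assuming images are NumPy arrays scaled to [0, 1]:

```python
import numpy as np

def pixel_p_norm(x, x_adv, p):
    """Pixel-space p-norm distortion between an image and its adversarial version."""
    d = (x_adv - x).ravel()
    if p == 0:
        return np.count_nonzero(d)          # number of changed pixel values
    if np.isinf(p):
        return np.abs(d).max()              # largest single-pixel change
    return (np.abs(d) ** p).sum() ** (1.0 / p)
```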