GRILL: Gradient Signal Restoration in Ill-Conditioned Layers to Enhance Adversarial Attacks on Autoencoders

Ramanaik, Chethan Krishnamurthy, Roy, Arjun, Callies, Tobias, Ntoutsi, Eirini

Aug-7-2025–arXiv.org Artificial Intelligence

Adversarial robustness of deep autoencoders (AEs) remains relatively unexplored, even though their non-invertible nature poses distinct challenges. Existing attack algorithms during the optimization of imperceptible, norm-bounded adversarial perturbations to maximize output damage in AEs, often stop at sub-optimal attacks. We observe that the adversarial loss gradient vanishes when backpropagated through ill-conditioned layers. This issue arises from near-zero singular values in the Jacobians of these layers, which weaken the gradient signal during optimization. We introduce GRILL, a technique that locally restores gradient signals in ill-conditioned layers, enabling more effective norm-bounded attacks. Through extensive experiments on different architectures of popular AEs, under both sample-specific and universal attack setups, and across standard and adaptive attack settings, we show that our method significantly increases the effectiveness of our adversarial attacks, enabling a more rigorous evaluation of AE robustness.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

Aug-7-2025

arXiv.org PDF

Add feedback

Genre:
- Research Report (1.00)

Industry:
- Information Technology > Security & Privacy (1.00)
- Government (1.00)

Technology:
- Information Technology
  - Security & Privacy (1.00)
  - Artificial Intelligence > Machine Learning
    - Neural Networks > Deep Learning (0.47)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found