Reviews: Robust Detection of Adversarial Attacks by Modeling the Intrinsic Properties of Deep Neural Networks

Oct-8-2024, 08:25:30 GMT–Neural Information Processing Systems

The paper presents an unsupervised approach to detecting adversarial attacks on deep neural networks. The authors model the intrinsic properties of the network to detect adversarial inputs: they fit a Gaussian Mixture Model (GMM) to the hidden state distribution (in practice, the states of the fully connected hidden layers) and flag a sample as adversarial when its likelihood under the model falls below a given threshold. Exhaustive experiments in different settings show that the proposed method achieves state-of-the-art performance among unsupervised methods while generalizing better than supervised approaches. The paper reads well and is technically sound.
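
To make the detection rule concrete, here is a minimal sketch of likelihood thresholding with a GMM. It is not the authors' implementation. The diagonal covariances, the synthetic 2-D "hidden states", the percentile rule for picking the threshold, and every function name (`fit_gmm`, `choose_threshold`, `is_adversarial`) are assumptions made for illustration; the review only states that a GMM is fitted to hidden states and that low-likelihood samples are flagged.

```python
import math
import random


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def logsumexp(vals):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    top = max(vals)
    return top + math.log(sum(math.exp(v - top) for v in vals))


def log_likelihood(x, gmm):
    """Log likelihood of x under the mixture (weights, means, variances)."""
    weights, means, variances = gmm
    return logsumexp([math.log(w) + log_gauss_diag(x, m, v)
                      for w, m, v in zip(weights, means, variances)])


def fit_gmm(data, k, iters=50, seed=0, min_var=1e-3):
    """Fit a k-component diagonal GMM with plain EM (illustrative only)."""
    rng = random.Random(seed)
    dim = len(data[0])
    means = [list(p) for p in rng.sample(data, k)]  # init means from data
    variances = [[1.0] * dim for _ in range(k)]
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            logs = [math.log(w) + log_gauss_diag(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            norm = logsumexp(logs)
            resp.append([math.exp(l - norm) for l in logs])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12  # guard against empty component
            weights[j] = nj / len(data)
            means[j] = [sum(r[j] * x[d] for r, x in zip(resp, data)) / nj
                        for d in range(dim)]
            variances[j] = [
                max(min_var,
                    sum(r[j] * (x[d] - means[j][d]) ** 2
                        for r, x in zip(resp, data)) / nj)
                for d in range(dim)]
    return weights, means, variances


def choose_threshold(clean_scores, false_positive_rate=0.05):
    """Assumed rule: threshold at a low percentile of clean log-likelihoods."""
    ranked = sorted(clean_scores)
    return ranked[int(false_positive_rate * len(ranked))]


def is_adversarial(x, gmm, threshold):
    """Flag x when its log likelihood falls below the threshold."""
    return log_likelihood(x, gmm) < threshold


# Synthetic stand-in for hidden states of clean inputs: two 2-D clusters.
rng = random.Random(1)
clean_states = ([[rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)] for _ in range(100)]
                + [[rng.gauss(5.0, 0.5), rng.gauss(5.0, 0.5)] for _ in range(100)])

gmm = fit_gmm(clean_states, k=2)
clean_scores = [log_likelihood(x, gmm) for x in clean_states]
threshold = choose_threshold(clean_scores)

# A state far from both clusters stands in for an adversarial input.
suspicious_state = [20.0, -20.0]
print(is_adversarial(suspicious_state, gmm, threshold))
```

Fitting only on clean data is what makes the scheme unsupervised: no adversarial examples are needed, and the threshold alone trades off missed detections against false alarms on clean inputs.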

adversarial attack, deep neural network, intrinsic property, (7 more...)

Industry:
- Information Technology > Security & Privacy (0.97)
- Government > Military (0.97)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.64)