Is AmI (Attacks Meet Interpretability) Robust to Adversarial Examples?

Carlini, Nicholas

arXiv.org Machine Learning 

AmI (Attacks meet Interpretability) is an "attribute-steered" defense [3] for detecting [1] adversarial examples [2] on face-recognition models. By applying interpretability techniques to a pre-trained neural network, AmI identifies "important" neurons. It then creates a second, augmented neural network with the same parameters but with the activations of the important neurons strengthened. AmI rejects inputs on which the original and augmented networks disagree.
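The detection rule described above can be sketched in a few lines. This is a minimal illustration, not the actual AmI implementation: the network is a toy two-layer model, and the `important` mask is hard-coded here, whereas in AmI it is derived from interpretability analysis of the real face-recognition model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer network standing in for the face-recognition model.
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)

def forward(x, boost_mask=None, boost=2.0):
    """Forward pass; if boost_mask is given, amplify those hidden
    activations, mimicking AmI's augmented network."""
    h = np.maximum(0.0, x @ W1 + b1)            # ReLU hidden features
    if boost_mask is not None:
        h = h * np.where(boost_mask, boost, 1.0)
    return h @ W2 + b2                          # class logits

def ami_detect(x, important):
    """Flag the input if the original and augmented networks disagree."""
    orig = np.argmax(forward(x))
    aug = np.argmax(forward(x, boost_mask=important))
    return bool(orig != aug)                    # True => reject as adversarial

# Hypothetical "important neuron" mask (in AmI this comes from
# interpretability techniques; fixed here for illustration only).
important = np.zeros(8, dtype=bool)
important[[1, 3]] = True

x = rng.normal(size=4)
print("rejected:", ami_detect(x, important))
```

On a benign input the two networks usually agree, so the input is accepted; an adversarial perturbation that flips the original network's prediction without correspondingly shifting the boosted "important" features triggers a disagreement and is rejected.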
