Natural and Adversarial Error Detection using Invariance to Image Transformations

Bahat, Yuval, Irani, Michal, Shakhnarovich, Gregory

Feb-1-2019–arXiv.org Machine Learning

We propose an approach to distinguish between correct and incorrect image classifications. Our approach can detect misclassifications which either occur unintentionally ("natural errors"), or due to intentional adversarial attacks ("adversarial errors"), both in a single unified framework. Our approach is based on the observation that correctly classified images tend to exhibit robust and consistent classifications under certain image transformations (e.g., horizontal flip, small image translation, etc.). In contrast, incorrectly classified images (whether due to adversarial errors or natural errors) tend to exhibit large variations in classification results under such transformations. Our approach does not require any modifications or retraining of the classifier, hence can be applied to any pre-trained classifier. We further use state-of-the-art targeted adversarial attacks to demonstrate that even when the adversary has full knowledge of our method, the adversarial distortion needed for bypassing our detector is no longer imperceptible to the human eye. Our approach obtains state-of-the-art results compared to previous adversarial detection methods, surpassing them by a large margin.
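The observation in the abstract (stable predictions under small transformations for correct classifications, unstable predictions for errors) can be illustrated with a minimal Python sketch. This is not the authors' detector, whose details are not given here; it only measures how often a classifier's predicted label survives a flip and small shifts. The names `consistency_score`, `mean_classifier`, and `left_right_classifier` are hypothetical, images are plain nested lists to keep the example dependency-free, and the translation is a cyclic shift chosen for simplicity.

```python
from typing import Callable, List, Sequence

Image = List[List[float]]                        # grayscale image as rows of pixels
Classifier = Callable[[Image], Sequence[float]]  # returns one score per class


def horizontal_flip(image: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def translate(image: Image, dx: int) -> Image:
    """Shift the image horizontally by dx pixels.

    Assumption: the shift is cyclic (pixels wrap around the border),
    which keeps the sketch short; a real pipeline might pad instead.
    """
    if dx == 0:
        return [list(row) for row in image]
    return [row[-dx:] + row[:-dx] for row in image]


def predicted_label(scores: Sequence[float]) -> int:
    """Index of the highest class score (first index wins on ties)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def consistency_score(classify: Classifier, image: Image,
                      transforms: Sequence[Callable[[Image], Image]]) -> float:
    """Fraction of transformed copies that keep the original predicted label.

    A value near 1.0 means the prediction is stable under the transforms;
    a low value flags the prediction as suspicious under the abstract's premise.
    """
    base = predicted_label(classify(image))
    kept = sum(predicted_label(classify(t(image))) == base for t in transforms)
    return kept / len(transforms)


# Two hypothetical toy classifiers, used only to exercise the score.
def mean_classifier(image: Image) -> List[float]:
    """Scores 'dark' vs 'bright' from mean intensity (flip/shift invariant)."""
    pixels = [p for row in image for p in row]
    m = sum(pixels) / len(pixels)
    return [1.0 - m, m]


def left_right_classifier(image: Image) -> List[float]:
    """Scores 'left-heavy' vs 'right-heavy' (deliberately NOT flip invariant)."""
    half = len(image[0]) // 2
    left = sum(sum(row[:half]) for row in image)
    right = sum(sum(row[half:]) for row in image)
    return [left, right]
```

Called with a list of transforms such as `[horizontal_flip, lambda im: translate(im, 1), lambda im: translate(im, -1)]`, the score stays at 1.0 for the invariant toy classifier and drops for the flip-sensitive one. Thresholding such a score is only one possible way to turn the observation into a detector; the threshold and the set of transforms would have to be chosen on held-out data.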

classifier, deep learning, neural network

Country:
- North America > United States (0.68)

Genre:
- Research Report (0.50)

Industry:
- Government > Military (1.00)
- Information Technology > Security & Privacy (0.87)

Technology:
- Information Technology
  - Artificial Intelligence
    - Machine Learning > Neural Networks
      - Deep Learning (0.68)
    - Vision (1.00)
  - Sensing and Signal Processing (1.00)
