Revealing Perceptible Backdoors, without the Training Set, via the Maximum Achievable Misclassification Fraction Statistic

Xiang, Zhen, Miller, David J., Kesidis, George

Nov-18-2019–arXiv.org Machine Learning

Recently, a special type of data poisoning (DP) attack, known as a backdoor, was proposed. These attacks aim to have a classifier learn to classify to a target class whenever the backdoor pattern is present in a test sample. In this paper, we address post-training detection of perceptible backdoor patterns in DNN image classifiers, wherein the defender does not have access to the poisoned training set, but only to the trained classifier itself and to clean (unpoisoned) examples from the classification domain. This problem is challenging because a perceptible backdoor pattern could be any seemingly innocuous object in a scene and, without the poisoned training set, we have no hint about the actual backdoor pattern used during training. We identify two important properties of perceptible backdoor patterns, based upon which we propose a novel detector using the maximum achievable misclassification fraction (MAMF) statistic. We detect whether the trained DNN has been backdoor-attacked and infer the source and target classes used for devising the attack. Our detector, with an easily chosen threshold, is evaluated on five datasets, five DNN structures, and nine backdoor patterns, and shows strong detection capability. Coupled with an imperceptible backdoor detector, our approach helps achieve detection for all evasive backdoors of interest.

INTRODUCTION

Deep neural network (DNN) classifiers have achieved state-of-the-art pattern recognition performance in many research areas such as speech recognition [6], bioinformatics [22], and computer vision [12][13]. However, they have also been shown to be vulnerable to adversarial attacks [23]. This has inspired adversarial learning research, an ongoing contest between attackers and defenders.

backdoor pattern, detection, spatial support

Country:
- North America > United States
  - Pennsylvania (0.04)
  - California > San Diego County
    - San Diego (0.04)

Genre:
- Research Report (0.64)

Industry:
- Information Technology > Security & Privacy (0.68)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
