Backdoor Defense through Self-Supervised and Generative Learning

Sabolić, Ivan, Grubišić, Ivan, Šegvić, Siniša

Sep-2-2024–arXiv.org Artificial Intelligence

Backdoor attacks change a small portion of training data by introducing hand-crafted triggers and rewiring the corresponding labels towards a desired target class. Training on such data injects a backdoor which causes malicious inference in selected test samples. Most defenses mitigate such attacks through various modifications of the discriminative learning procedure. In contrast, this paper explores an approach based on generative modelling of per-class distributions in a self-supervised representation space. Interestingly, these representations get either preserved or heavily disturbed under recent backdoor attacks. In both cases, we find that per-class generative models allow to detect poisoned data and cleanse the dataset. Experiments show that training on cleansed dataset greatly reduces the attack success rate and retains the accuracy on benign inputs.

backdoor attack, dataset, target class, (12 more...)

arXiv.org Artificial Intelligence

Sep-2-2024

arXiv.org PDF

Add feedback

Country:
- North America
  - United States > Louisiana
    - Orleans Parish > New Orleans (0.04)
  - Canada
    - Ontario > Toronto (0.04)
    - Alberta > Census Division No. 15
      - Improvement District No. 9 > Banff (0.04)
- Europe
  - France > Hauts-de-France
    - Nord > Lille (0.04)
  - Croatia > Zagreb County
    - Zagreb (0.04)

Genre:
- Research Report (1.00)

Industry:
- Information Technology > Security & Privacy (1.00)

Technology:
- Information Technology
  - Security & Privacy (1.00)
  - Artificial Intelligence
    - Vision (1.00)
    - Machine Learning > Neural Networks
      - Deep Learning (0.93)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found