Anti-Backdoor Learning: Training Clean Models on Poisoned Data

Oct-11-2024, 10:00:59 GMT–Neural Information Processing Systems

Backdoor attack has emerged as a major security threat to deep neural networks (DNNs). While existing defense methods have demonstrated promising results on detecting or erasing backdoors, it is still not clear whether robust training methods can be devised to prevent the backdoor triggers being injected into the trained model in the first place. In this paper, we introduce the concept of \emph{anti-backdoor learning}, aiming to train \emph{clean} models given backdoor-poisoned data. We frame the overall learning process as a dual-task of learning the \emph{clean} and the \emph{backdoor} portions of data. From this view, we identify two inherent characteristics of backdoor attacks as their weaknesses: 1) the models learn backdoored data much faster than learning with clean data, and the stronger the attack the faster the model converges on backdoored data; 2) the backdoor task is tied to a specific class (the backdoor target class).

anti-backdoor learning, poisoned data, training clean model, (8 more...)

Neural Information Processing Systems

Oct-11-2024, 10:00:59 GMT

Conferences Web Page

Add feedback

Industry:
- Information Technology > Security & Privacy (1.00)

Technology:
- Information Technology
  - Security & Privacy (1.00)
  - Artificial Intelligence > Machine Learning
    - Neural Networks (0.60)