Marksman Backdoor: Backdoor Attacks with Arbitrary Target Class

Jan-19-2025, 08:07:30 GMT–Neural Information Processing Systems

In recent years, machine learning models have been shown to be vulnerable to backdoor attacks. Under such attacks, an adversary embeds a stealthy backdoor into the trained model so that the compromised model behaves normally on clean inputs but, on maliciously constructed inputs carrying a trigger, misclassifies as the adversary dictates. While these existing attacks are very effective, the adversary's capability is limited: given an input, they can only cause the model to misclassify toward a single pre-defined target class. In contrast, this paper exploits a novel backdoor attack with a much more powerful payload, denoted as Marksman, in which the adversary can arbitrarily choose the target class into which the model will misclassify any given input during inference. To achieve this goal, we propose to represent the trigger function as a class-conditional generative model and to inject the backdoor in a constrained optimization framework, where the trigger function learns to generate an optimal trigger pattern to attack any target class at will while simultaneously embedding this generative backdoor into the trained model.
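To make the idea of a class-conditional trigger function concrete, the following is a minimal sketch, not the paper's implementation. It assumes the trigger is an additive perturbation whose L-infinity norm is bounded by a budget `eps`, and it replaces the learned generative model with a fixed random lookup table holding one pattern per class. The names `make_trigger_table`, `apply_trigger`, `poison_batch`, and the parameter `eps` are hypothetical and chosen only for illustration; in the paper the trigger function is learned jointly with the classifier rather than sampled at random.

```python
import numpy as np


def make_trigger_table(num_classes, input_shape, seed=0):
    """Stand-in for a learned class-conditional generator (assumption):
    one raw, unconstrained pattern per target class."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((num_classes, *input_shape))


def apply_trigger(x, target_class, table, eps=0.05):
    """Add the pattern for `target_class` to input `x`.

    tanh squashes the raw pattern into [-1, 1], so scaling by `eps`
    keeps the perturbation within an L-infinity budget of `eps`.
    The result is clipped to the valid pixel range [0, 1].
    """
    pattern = eps * np.tanh(table[target_class])
    return np.clip(x + pattern, 0.0, 1.0)


def poison_batch(xs, table, num_classes, eps=0.05, seed=0):
    """Pair each input with a randomly chosen target class and its trigger,
    illustrating that any class can serve as the target."""
    rng = np.random.default_rng(seed)
    targets = rng.integers(0, num_classes, size=len(xs))
    poisoned = np.stack(
        [apply_trigger(x, c, table, eps) for x, c in zip(xs, targets)]
    )
    return poisoned, targets
```

In this sketch the adversary selects the target at inference time simply by choosing which class index to pass to `apply_trigger`; a single-target backdoor would correspond to a table with only one usable row.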

arbitrary target class, backdoor attack, target class

Industry:
- Information Technology > Security & Privacy (0.99)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning (0.77)
  - Representation & Reasoning > Optimization (0.60)