Black-box Backdoor Defense via Zero-shot Image Purification
Neural Information Processing Systems
Backdoor attacks inject poisoned samples into the training data, causing poisoned inputs to be misclassified during a model's deployment. Defending against such attacks is challenging, especially for real-world black-box models where only query access is permitted. In this paper, we propose a novel defense framework against backdoor attacks through Zero-shot Image Purification (ZIP). Our framework can be applied to poisoned models without requiring internal information about the model or any prior knowledge of the clean/poisoned samples. Our defense framework involves two steps.
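The abstract's core idea, purifying each input before it reaches the black-box model so query access alone suffices, can be sketched minimally. The snippet below is illustrative only: `purify` is a hypothetical stand-in (a simple mean filter) for ZIP's actual purification procedure, which the abstract does not detail, and `defended_query` shows how a defense wraps a query-only model.

```python
import numpy as np

def purify(image, blur_strength=2):
    # Hypothetical stand-in for ZIP's purification step: a local mean
    # filter that attenuates high-frequency trigger patterns while
    # preserving coarse image content. The real framework's procedure
    # is not specified in the abstract.
    k = 2 * blur_strength + 1
    padded = np.pad(image, blur_strength, mode="edge")
    out = np.empty(image.shape, dtype=float)
    h, w = image.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + k, j:j + k].mean()
    return out

def defended_query(model_query, image):
    # The black-box model is reachable only through `model_query`;
    # the defense never inspects its internals, it just purifies
    # every input before forwarding the query.
    return model_query(purify(image))
```

A deployment would route every inference request through `defended_query`, keeping the poisoned model itself untouched, which is what makes the defense viable under query-only access.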