Shadowcast: Stealthy Data Poisoning Attacks against Vision-Language Models
Neural Information Processing Systems
Vision-Language Models (VLMs) excel at generating textual responses from visual inputs, but their versatility raises security concerns. This study takes the first step in exposing VLMs' susceptibility to data poisoning attacks that can manipulate responses to innocuous, everyday prompts. We introduce Shadowcast, a stealthy data poisoning attack in which poison samples are visually indistinguishable from benign images with matching texts. Shadowcast demonstrates effectiveness in two attack types. The first is a traditional Label Attack, tricking VLMs into misidentifying class labels, such as confusing Donald Trump with Joe Biden. The second is a Persuasion Attack, which leverages VLMs' text generation capabilities to craft persuasive yet misleading narratives, such as portraying junk food as healthy.
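To make the "visually indistinguishable poison sample" idea concrete, below is a minimal, hypothetical sketch of a feature-matching perturbation of the kind the abstract describes: a benign-looking image is nudged within a small pixel budget so that its latent features resemble those of an image from another concept. This is not the authors' code; the stand-in torchvision encoder, the function `craft_poison_image`, and the tensors `base` and `target` are illustrative assumptions, while a real attack would target the victim VLM's own image encoder.

```python
# Hypothetical sketch of a feature-matching poison perturbation (not the paper's implementation).
import torch
import torchvision.models as models

# Stand-in vision encoder; Shadowcast-style attacks would use the VLM's image encoder instead.
encoder = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
encoder.fc = torch.nn.Identity()  # expose penultimate features
encoder.eval()

def craft_poison_image(base_img, target_img, eps=8 / 255, steps=100, lr=1e-2):
    """Perturb `base_img` (kept visually unchanged, ||delta||_inf <= eps) so that its
    encoder features move toward those of `target_img` from another concept."""
    delta = torch.zeros_like(base_img, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    with torch.no_grad():
        target_feat = encoder(target_img)
    for _ in range(steps):
        poison = (base_img + delta).clamp(0, 1)
        loss = torch.nn.functional.mse_loss(encoder(poison), target_feat)
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)  # keep the perturbation visually imperceptible
    return (base_img + delta).detach().clamp(0, 1)

# Usage with random tensors standing in for real (N, 3, 224, 224) images:
base = torch.rand(1, 3, 224, 224)    # image the poison should still resemble
target = torch.rand(1, 3, 224, 224)  # image from the concept to mimic in feature space
poison = craft_poison_image(base, target)
```

In a poisoning setting, the resulting image would be paired with text matching its benign appearance and injected into training data; the effectiveness and stealth trade-off depends on the pixel budget `eps` and the encoder being attacked.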