Text-Guided Attention is All You Need for Zero-Shot Robustness in Vision-Language Models
Lu Yu
–Neural Information Processing Systems
Pre-trained vision-language models (e.g., CLIP) have attracted widespread attention and adoption across various domains. Nonetheless, CLIP has been observed to be susceptible to adversarial examples. Through experimental analysis, we observe that adversarial perturbations induce shifts in text-guided attention. Building upon this observation, we propose a simple yet effective strategy: Text-Guided Attention for Zero-Shot Robustness (TGA-ZSR). This framework incorporates two components: the Attention Refinement module and the Attention-based Model Constraint module.
Jun-1-2025, 01:20:56 GMT
- Country:
- Europe > Switzerland > Zürich > Zürich (0.14)
- Genre:
- Research Report
- Experimental Study (0.93)
- New Finding (0.93)
- Industry:
- Health & Medicine > Diagnostic Medicine (0.46)
- Information Technology (0.69)
- Technology: