JailBound: Jailbreaking Internal Safety Boundaries of Vision-Language Models

Neural Information Processing Systems

Vision-Language Models (VLMs) exhibit impressive performance, yet the integration of powerful vision encoders has significantly broadened their attack surface, rendering them increasingly susceptible to jailbreak attacks.