Are Vision-Language Models Safe in the Wild? A Meme-Based Benchmark Study
DongGeon Lee, Joonwon Jang, Jihae Jeong, Hwanjo Yu
arXiv.org Artificial Intelligence
Rapid deployment of vision-language models (VLMs) magnifies safety risks, yet most evaluations rely on artificial images. This study asks: How safe are current VLMs when confronted with meme images that ordinary users share? To investigate this question, we introduce MemeSafetyBench, a 50,430-instance benchmark pairing real meme images with both harmful and benign instructions. Using a comprehensive safety taxonomy and LLM-based instruction generation, we assess multiple VLMs across single-turn and multi-turn interactions. We investigate how real-world memes influence harmful outputs, the mitigating effects of conversational context, and the relationship between model scale and safety metrics. Our findings demonstrate that VLMs are more vulnerable to meme-based harmful prompts than to synthetic or typographic images. Memes significantly increase harmful responses and decrease refusals compared to text-only inputs. Though multi-turn interactions provide partial mitigation, elevated vulnerability persists. These results highlight the need for ecologically valid evaluations and stronger safety mechanisms. MemeSafetyBench is publicly available at https://github.com/oneonlee/Meme-Safety-Bench.
Sep-24-2025