RAG LLMs are Not Safer: A Safety Analysis of Retrieval-Augmented Generation for Large Language Models
An, Bang; Zhang, Shiyue; Dredze, Mark
arXiv.org Artificial Intelligence
Efforts to ensure the safety of large language models (LLMs) include safety fine-tuning, evaluation, and red teaming. However, despite the widespread use of the Retrieval-Augmented Generation (RAG) framework, AI safety work focuses on standard LLMs, which means we know little about how RAG use cases change a model's safety profile. We conduct a detailed comparative analysis of RAG and non-RAG frameworks with eleven LLMs. We find that RAG can make models less safe and change their safety profile. We explore the causes of this change and find that even combinations of safe models with safe documents can cause unsafe generations. In addition, we evaluate some existing red teaming methods for RAG settings and show that they are less effective than when used for non-RAG settings. Our work highlights the need for safety research and red-teaming methods specifically tailored for RAG LLMs.
Apr-28-2025
- Country:
- Asia
- China > Jiangsu Province
- Changzhou (0.04)
- Indonesia > Bali (0.04)
- Middle East > UAE
- Abu Dhabi Emirate > Abu Dhabi (0.04)
- Myanmar > Tanintharyi Region
- Dawei (0.04)
- Singapore (0.04)
- Thailand > Bangkok
- Bangkok (0.04)
- Europe
- Denmark > Capital Region
- Copenhagen (0.04)
- France (0.14)
- Latvia > Lubāna Municipality
- Lubāna (0.04)
- Middle East > Malta
- Eastern Region > Northern Harbour District > St. Julian's (0.04)
- Portugal > Guarda
- Guarda (0.04)
- Slovenia > Drava
- Municipality of Benedikt > Benedikt (0.04)
- North America
- Dominican Republic (0.04)
- Mexico > Mexico City
- Mexico City (0.04)
- United States
- Genre:
- Research Report > New Finding (0.46)
- Industry:
- Education > Educational Setting
- Higher Education (0.46)
- Energy (1.00)
- Government (1.00)
- Health & Medicine (0.67)
- Information Technology > Security & Privacy (1.00)
- Law (1.00)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
- Media (0.93)
- Technology: