Not enough data to create a plot.
Try a different view from the menu above.
Baswa, Tanay
No Free Lunch with Guardrails
Kumar, Divyanshu, Birur, Nitin Aravind, Baswa, Tanay, Agarwal, Sahil, Harshangi, Prashanth
As large language models (LLMs) and generative AI become widely adopted, guardrails have emerged as a key tool to ensure their safe use. However, adding guardrails isn't without tradeoffs; stronger security measures can reduce usability, while more flexible systems may leave gaps for adversarial attacks. In this work, we explore whether current guardrails effectively prevent misuse while maintaining practical utility. We introduce a framework to evaluate these tradeoffs, measuring how different guardrails balance risk, security, and usability, and build an efficient guardrail. Our findings confirm that there is no free lunch with guardrails; strengthening security often comes at the cost of usability. To address this, we propose a blueprint for designing better guardrails that minimize risk while maintaining usability. We evaluate various industry guardrails, including Azure Content Safety, Bedrock Guardrails, OpenAI's Moderation API, Guardrails AI, Nemo Guardrails, and Enkrypt AI guardrails. Additionally, we assess how LLMs like GPT-4o, Gemini 2.0-Flash, Claude 3.5-Sonnet, and Mistral Large-Latest respond under different system prompts, including simple prompts, detailed prompts, and detailed prompts with chain-of-thought (CoT) reasoning. Our study provides a clear comparison of how different guardrails perform, highlighting the challenges in balancing security and usability.
VERA: Validation and Enhancement for Retrieval Augmented systems
Birur, Nitin Aravind, Baswa, Tanay, Kumar, Divyanshu, Loya, Jatan, Agarwal, Sahil, Harshangi, Prashanth
Large language models (LLMs) exhibit remarkable capabilities but often produce inaccurate responses, as they rely solely on their embedded knowledge. Retrieval-Augmented Generation (RAG) enhances LLMs by incorporating an external information retrieval system, supplying additional context along with the query to mitigate inaccuracies for a particular context. However, accuracy issues still remain, as the model may rely on irrelevant documents or extrapolate incorrectly from its training knowledge. To assess and improve the performance of both the retrieval system and the LLM in a RAG framework, we propose \textbf{VERA} (\textbf{V}alidation and \textbf{E}nhancement for \textbf{R}etrieval \textbf{A}ugmented systems), a system designed to: 1) Evaluate and enhance the retrieved context before response generation, and 2) Evaluate and refine the LLM-generated response to ensure precision and minimize errors. VERA employs an evaluator-cum-enhancer LLM that first checks if external retrieval is necessary, evaluates the relevance and redundancy of the retrieved context, and refines it to eliminate non-essential information. Post-response generation, VERA splits the response into atomic statements, assesses their relevance to the query, and ensures adherence to the context. Our experiments demonstrate VERA's remarkable efficacy not only in improving the performance of smaller open-source models, but also larger state-of-the art models. These enhancements underscore VERA's potential to produce accurate and relevant responses, advancing the state-of-the-art in retrieval-augmented language modeling. VERA's robust methodology, combining multiple evaluation and refinement steps, effectively mitigates hallucinations and improves retrieval and response processes, making it a valuable tool for applications demanding high accuracy and reliability in information generation. .