ClashEval: Quantifying the tug-of-war between an LLM's internal prior and external evidence
Eric Wu*
Department of Biomedical Data Science, Department of Electrical Engineering, Stanford University
Neural Information Processing Systems
Retrieval augmented generation (RAG) is frequently used to mitigate hallucinations and provide up-to-date knowledge for large language models (LLMs). However, document retrieval is an imprecise task and sometimes surfaces erroneous or even harmful content, which raises the question of how LLMs handle retrieved information: If the provided content is incorrect, does the model know to ignore it, or does it recapitulate the error? Conversely, when the model's initial response is incorrect, does it always know to use the retrieved information to correct itself, or does it insist on its wrong prior response? To answer this, we curate a dataset of over 1200 questions across six domains (e.g., drug dosages, Olympic records, locations) along with content relevant to answering each question. We further apply precise perturbations to the answers in the content that range from subtle to blatant errors.
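One way to picture the perturbation step is as scaling the numeric answer embedded in the retrieved passage by progressively larger factors, so the same document can carry a subtle or a blatant error. The following is a minimal Python sketch of that idea; the function name, the specific factor levels, and the example question are illustrative assumptions, not the authors' released implementation.

```python
import re

# Hypothetical perturbation levels, from a small relative shift to an
# order-of-magnitude error (illustrative values, not from the paper).
PERTURBATION_FACTORS = {
    "subtle": 1.2,     # 20% change
    "moderate": 2.0,   # 2x change
    "blatant": 10.0,   # 10x change
}

def perturb_numeric_answer(context: str, answer: str, level: str) -> tuple[str, str]:
    """Return (perturbed_context, perturbed_answer) with the numeric answer scaled.

    Assumes the answer is a single number that appears verbatim in the
    retrieved context (e.g., a drug dosage such as "500 mg").
    """
    match = re.search(r"[-+]?\d*\.?\d+", answer)
    if match is None:
        raise ValueError(f"No numeric value found in answer: {answer!r}")

    original = float(match.group())
    perturbed = original * PERTURBATION_FACTORS[level]
    # Preserve integer formatting when the original value was an integer.
    perturbed_str = str(int(perturbed)) if match.group().isdigit() else f"{perturbed:g}"

    new_answer = answer.replace(match.group(), perturbed_str, 1)
    new_context = context.replace(answer, new_answer)
    return new_context, new_answer

# Example: the retrieved document now states an incorrect dosage.
ctx, ans = perturb_numeric_answer(
    context="The maximum recommended daily dose of acetaminophen for adults is 4000 mg.",
    answer="4000 mg",
    level="blatant",
)
```

Each perturbed document can then be paired with the original question to test whether the model repeats the planted error or falls back on its prior answer.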