ClashEval: Quantifying the tug-of-war between an LLM's internal prior and external evidence

Neural Information Processing Systems 

GPT-4o, on this dataset and find that LLMs are susceptible to adopting incorrect retrieved content, overriding their own correct prior knowledge over 60% of the time. However, the more unrealistic the retrieved content is (i.e., more deviated from
