Trust Me, I'm Wrong: High-Certainty Hallucinations in LLMs

Adi Simhi, Itay Itzhak, Fazl Barez, Gabriel Stanovsky, Yonatan Belinkov

arXiv.org, Artificial Intelligence

Large Language Models (LLMs) often generate outputs that lack grounding in real-world facts, a phenomenon known as hallucinations. Prior research has associated hallucinations with model uncertainty, leveraging this relationship for hallucination detection and mitigation. In this paper, we challenge the underlying assumption that all hallucinations are associated with uncertainty. Using knowledge detection and uncertainty measurement methods, we demonstrate that models can hallucinate with high certainty even when they have the correct knowledge. We further show that high-certainty hallucinations are consistent across models and datasets, distinctive enough to be singled out, and challenge existing mitigation methods. Our findings reveal an overlooked aspect of hallucinations, emphasizing the need to understand their origins and improve mitigation strategies.

Figure 1: Do high-certainty hallucinations exist? An illustrative categorization of hallucinations based on a model's knowledge and certainty. Highlighted is the phenomenon of high-certainty hallucinations (purple), where models confidently produce incorrect outputs even when they have the correct knowledge. While other types of hallucinations can potentially be explained by the model not knowing, being mistaken, or uncertain, high-certainty hallucinations are harder to rationalize, making their existence particularly intriguing.
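The abstract contrasts what a model knows with how certain it is when answering. As a minimal sketch of what an uncertainty measurement could look like, the snippet below scores a model's certainty in a candidate answer as the average token log-probability it assigns to that answer given the prompt. This is an illustration only and not necessarily the knowledge-detection or uncertainty-measurement methods used in the paper; the model name ("gpt2"), the helper `answer_certainty`, and the example prompt are placeholders chosen here.

```python
# Sketch: certainty of a model's answer as mean token log-probability.
# Assumes Hugging Face transformers and PyTorch are installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any causal LM works for the illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def answer_certainty(prompt: str, answer: str) -> float:
    """Mean log-probability the model assigns to the answer tokens
    when they follow the prompt (higher = more certain)."""
    # Simplification: assumes tokenizing prompt and prompt+answer
    # separately aligns at the boundary (true for this example).
    prompt_len = tokenizer(prompt, return_tensors="pt").input_ids.shape[1]
    full_ids = tokenizer(prompt + answer, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    # logits at position i predict the token at position i + 1
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    positions = range(prompt_len - 1, full_ids.shape[1] - 1)
    scores = [log_probs[0, i, full_ids[0, i + 1]].item() for i in positions]
    return sum(scores) / len(scores)

print(answer_certainty("The capital of France is", " Paris"))
```

Comparing such a certainty score against an independent check of whether the model actually holds the correct knowledge (for example, whether it answers correctly under other phrasings) is what makes it possible, in principle, to isolate the cell highlighted in Figure 1: outputs that are wrong yet produced with high certainty despite the model knowing the answer.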
