Do LLMs Know about Hallucination? An Empirical Investigation of LLM's Hidden States
Duan, Hanyu; Yang, Yi; Tam, Kar Yan
arXiv.org Artificial Intelligence
Large Language Models (LLMs) can make up answers that are not real, a behavior known as hallucination. This research investigates whether, how, and to what extent LLMs are aware of hallucination. More specifically, we check whether and how an LLM's hidden states react differently when it answers a question correctly versus when it hallucinates. To do this, we introduce an experimental framework that allows examining an LLM's hidden states in different hallucination situations. Building upon this framework, we conduct a series of experiments with language models in the LLaMA family (Touvron et al., 2023). Our empirical findings suggest that LLMs react differently when processing a genuine response versus a fabricated one. We then apply various model interpretation techniques to help understand and explain the findings. Moreover, informed by the empirical observations, we show the great potential of using guidance derived from the LLM's hidden representation space to mitigate hallucination. We believe this work provides insights into how LLMs produce hallucinated answers and how to make them occur less often.
Feb-15-2024
- Country:
  - Asia
    - China
      - Anhui Province (0.04)
      - Hong Kong (0.04)
      - Hubei Province > Wuhan (0.04)
    - Middle East > Jordan (0.04)
    - Myanmar > Mandalay Region
      - Mandalay (0.04)
  - North America > United States
    - California > Los Angeles County
      - Los Angeles (0.04)
    - Colorado (0.04)
    - Florida > Duval County
      - Jacksonville (0.04)
    - Massachusetts (0.04)
    - Nebraska > Lancaster County
      - Lincoln (0.04)
    - Nevada > Clark County
      - Las Vegas (0.04)
    - New York > Queens County
      - New York City (0.04)
    - Texas > Dallas County
      - Dallas (0.04)
  - South America > Peru (0.04)
- Genre:
  - Research Report
    - Experimental Study (0.97)
    - New Finding (1.00)
- Industry:
  - Leisure & Entertainment > Sports
    - Football (1.00)
  - Media > Film (1.00)
- Technology: