Hallucination Detection in LLMs Using Spectral Features of Attention Maps
Binkowski, Jakub, Janiak, Denis, Sawczyn, Albert, Gabrys, Bogdan, Kajdanowicz, Tomasz
–arXiv.org Artificial Intelligence
Large Language Models (LLMs) have demonstrated remarkable performance across various tasks but remain prone to hallucinations. Detecting hallucinations is essential for safety-critical applications, and recent methods leverage attention map properties to this end, though their effectiveness remains limited. In this work, we investigate the spectral features of attention maps by interpreting them as adjacency matrices of graph structures. We propose the $\text{LapEigvals}$ method, which utilises the top-$k$ eigenvalues of the Laplacian matrix derived from the attention maps as an input to hallucination detection probes. Empirical evaluations demonstrate that our approach achieves state-of-the-art hallucination detection performance among attention-based methods. Extensive ablation studies further highlight the robustness and generalisation of $\text{LapEigvals}$, paving the way for future advancements in the hallucination detection domain.
Feb-24-2025
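The core idea in the abstract, reading an attention map as the adjacency matrix of a graph and using the top-$k$ eigenvalues of its Laplacian as features, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `laplacian_topk_eigvals`, the symmetrisation step, the unnormalised Laplacian $L = D - A$, and the toy attention map are all assumptions made here for illustration, and the paper's exact Laplacian definition may differ.

```python
import numpy as np


def laplacian_topk_eigvals(attn: np.ndarray, k: int) -> np.ndarray:
    """Return the k largest Laplacian eigenvalues of one attention map.

    Illustrative sketch only; the construction below is an assumption,
    not the definition used in the paper.
    """
    # Assumption: symmetrise the map so the graph is undirected and the
    # Laplacian spectrum is real.
    adj = 0.5 * (attn + attn.T)
    # Degree matrix: row sums of the adjacency matrix on the diagonal.
    deg = np.diag(adj.sum(axis=1))
    # Unnormalised graph Laplacian L = D - A.
    lap = deg - adj
    # eigvalsh handles symmetric matrices and returns ascending eigenvalues.
    eigvals = np.linalg.eigvalsh(lap)
    # Reverse to descending order and keep the k largest.
    return eigvals[::-1][:k]


# Toy causal attention map (hypothetical data): lower-triangular rows
# produced by a masked softmax, so each row sums to 1.
rng = np.random.default_rng(0)
T = 6
scores = rng.normal(size=(T, T))
mask = np.tril(np.ones((T, T), dtype=bool))
scores = np.where(mask, scores, -np.inf)
attn = np.exp(scores - scores.max(axis=1, keepdims=True))
attn /= attn.sum(axis=1, keepdims=True)

features = laplacian_topk_eigvals(attn, k=3)
print(features)
```

In a setting like the one the abstract describes, such a vector would be computed for each attention map of interest and the resulting features passed to a hallucination detection probe; which layers and heads to use, and what kind of probe, are not specified here and are left open in this sketch.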