SeSE: A Structural Information-Guided Uncertainty Quantification Framework for Hallucination Detection in LLMs
Zhao, Xingtao, Peng, Hao, Su, Dingli, Zeng, Xianghua, Liu, Chunyang, Liao, Jinzhi, Yu, Philip S.
Reliable uncertainty quantification (UQ) is essential for deploying large language models (LLMs) in safety-critical scenarios, as it enables them to abstain from responding when uncertain, thereby avoiding "hallucinating" falsehoods. However, state-of-the-art UQ methods primarily rely on semantic probability distributions or pairwise distances, overlooking latent semantic structural information that could enable more precise uncertainty estimates. This paper presents Semantic Structural Entropy (SeSE), a principled UQ framework that quantifies the inherent semantic uncertainty of LLMs from a structural information perspective for hallucination detection. SeSE operates in a zero-resource manner and is applicable to both open- and closed-source LLMs, making it an "off-the-shelf" solution for new models and tasks. Specifically, to effectively model semantic spaces, we first develop an adaptively sparsified directed semantic graph construction algorithm that captures directional semantic dependencies while automatically pruning unnecessary connections that introduce negative interference. We then exploit latent semantic structural information through hierarchical abstraction: SeSE is defined as the structural entropy of the optimal semantic encoding tree, formalizing intrinsic uncertainty within semantic spaces after optimal compression. A higher SeSE value corresponds to greater uncertainty, indicating that LLMs are highly likely to generate hallucinations. In addition, to enhance fine-grained UQ in long-form generation, we extend SeSE to quantify the uncertainty of individual claims by modeling their random semantic interactions, providing theoretically explicable hallucination detection. Extensive experiments across 29 model-dataset combinations show that SeSE significantly outperforms advanced UQ baselines.
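As a rough illustration of the "structural entropy of an encoding tree" idea mentioned in the abstract, the sketch below computes the standard two-level structural entropy of an undirected weighted graph under a given partition. This is not the SeSE algorithm: the abstract describes a directed, adaptively sparsified semantic graph and an optimal encoding tree, neither of which is specified here. The function name `structural_entropy`, the toy graph, and the partitions are all hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict


def structural_entropy(edges, partition):
    """Two-level structural entropy (bits) of an undirected weighted graph.

    edges:     iterable of (u, v, weight) tuples with u != v
    partition: dict mapping each node to a community label
    Assumption: this is the generic textbook definition, not SeSE itself.
    """
    degree = defaultdict(float)  # weighted degree of each node
    volume = defaultdict(float)  # sum of degrees inside each community
    cut = defaultdict(float)     # weight of edges leaving each community

    for u, v, w in edges:
        degree[u] += w
        degree[v] += w
        if partition[u] != partition[v]:
            cut[partition[u]] += w
            cut[partition[v]] += w

    for node, d in degree.items():
        volume[partition[node]] += d

    total = sum(degree.values())  # volume of the whole graph
    entropy = 0.0

    # Leaf terms: locating a node inside its community.
    for node, d in degree.items():
        entropy -= (d / total) * math.log2(d / volume[partition[node]])

    # Community terms: locating a community inside the graph.
    for label, vol in volume.items():
        if cut[label] > 0:
            entropy -= (cut[label] / total) * math.log2(vol / total)

    return entropy


# Toy graph: two triangles joined by a single bridge edge (c, d).
edges = [
    ("a", "b", 1.0), ("b", "c", 1.0), ("c", "a", 1.0),
    ("d", "e", 1.0), ("e", "f", 1.0), ("f", "d", 1.0),
    ("c", "d", 1.0),
]
nodes = "abcdef"

flat = structural_entropy(edges, {n: 0 for n in nodes})
clustered = structural_entropy(
    edges, {n: (0 if n in "abc" else 1) for n in nodes}
)
singletons = structural_entropy(edges, {n: n for n in nodes})
```

On this toy graph the partition that matches the two triangles yields a lower entropy than the flat, single-community encoding, which is the sense in which a good encoding tree "compresses" the graph. Placing every node in its own community recovers the flat value.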
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Asia > China > Beijing > Beijing (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.92)
- Asia > China > Shanghai > Shanghai (0.04)
- North America > United States > Virginia (0.04)
- Asia > China > Ningxia Hui Autonomous Region > Yinchuan (0.04)
- Research Report > Experimental Study (0.93)
- Research Report > New Finding (0.67)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- North America > United States > Florida > Alachua County > Gainesville (0.14)
- North America > Greenland (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.93)
- Overview (0.67)
- Europe > United Kingdom > North Sea > Southern North Sea (0.04)
- North America > United States > California > Riverside County > Riverside (0.04)
- Europe > United Kingdom > England (0.04)
- Research Report > New Finding (0.93)
- Research Report > Experimental Study (0.92)
Direct Policy Gradients: Direct Optimization of Policies in Discrete Action Spaces
Many problems in machine learning reduce to learning a probability distribution (or policy) over sequences of discrete actions so as to maximize a downstream utility function. Examples include generating text sequences to maximize a task-specific metric like BLEU and generating action sequences in reinforcement learning (RL) to maximize expected return.
- North America > United States > Maryland (0.04)
- North America > Canada (0.04)
- Europe > Spain > Canary Islands (0.04)
- Asia > Middle East > Israel (0.04)
We would like to thank the reviewers for their valuable feedback, which we will duly consider and integrate in our
In this paper, we demonstrate that "the decision boundaries of a DNN can only exist as long We clarify the main points raised by the reviewers here below. We further shed more light on the relationship between adv. Nevertheless, we never claim that, within the discr. In fact, we agree that the margin associated to different discr. Overall, however, we firmly believe that the invariant dirs.