FineDialFact: A benchmark for Fine-grained Dialogue Fact Verification
Chen, Xiangyan, Li, Yufeng, Gan, Yujian, Zubiaga, Arkaitz, Purver, Matthew
arXiv.org Artificial Intelligence
Large Language Models (LLMs) are known to produce hallucinations (factually incorrect or fabricated information), which pose significant challenges for many Natural Language Processing (NLP) applications, such as dialogue systems. As a result, detecting hallucinations has become a critical area of research. Current approaches to hallucination detection in dialogue systems primarily focus on verifying the factual consistency of generated responses. However, these responses often contain a mix of accurate, inaccurate, and unverifiable facts, which makes a single factual label overly simplistic and coarse-grained. In this paper, we introduce a benchmark, FineDialFact, for fine-grained dialogue fact verification, which involves verifying atomic facts extracted from dialogue responses. To support this, we construct a dataset based on publicly available dialogue datasets and evaluate it using various baseline methods. Experimental results demonstrate that methods incorporating Chain-of-Thought (CoT) reasoning can enhance performance in dialogue fact verification. Despite this, the best F1-score achieved on HybriDialogue, an open-domain dialogue dataset, is only 0.75, indicating that the benchmark remains a challenging task for future research. Our dataset and code will be made publicly available on GitHub.
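To make the fine-grained setting concrete, the following is a minimal sketch of labelling a response at the level of atomic facts and scoring a verifier with F1. It is an illustration under stated assumptions, not the authors' code: the label names are taken from the wording of the abstract, the example facts and predictions are made up, and the use of macro-averaged F1 is an assumption, since the abstract does not say which F1 variant is reported.

```python
from collections import Counter

# Assumed label names, borrowed from the abstract's wording.
LABELS = ("accurate", "inaccurate", "unverifiable")


def macro_f1(gold, pred, labels=LABELS):
    """Unweighted mean of per-label F1 scores (assumed averaging scheme)."""
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        # A label with no gold and no predicted instances scores 0.0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical dialogue response split into atomic facts, one label each.
atomic_facts = [
    ("The club was founded in 1905.", "accurate"),
    ("It has won the league ten times.", "inaccurate"),
    ("Its fans are the loudest in the country.", "unverifiable"),
]
gold = [label for _, label in atomic_facts]
pred = ["accurate", "accurate", "unverifiable"]  # made-up verifier output

# One response carries three different labels, so a single
# response-level label could not describe it faithfully.
label_mix = Counter(gold)
score = macro_f1(gold, pred)
print(dict(label_mix), round(score, 3))
```

In this toy case the verifier misses the one inaccurate fact, so the per-label scores diverge sharply even though two of three predictions are correct.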
Aug-11-2025
- Country:
  - Asia > Russia (0.04)
  - Europe
    - France (0.05)
    - Russia > Volga Federal District
      - Republic of Bashkortostan (0.14)
    - Slovenia (0.04)
    - United Kingdom
      - England
        - Cumbria (0.04)
        - East Midlands (0.04)
        - Greater London > London (0.04)
        - West Midlands (0.04)
      - Northern Ireland (0.04)
      - Scotland (0.04)
      - Wales (0.04)
  - North America
    - Dominican Republic (0.04)
    - United States
      - California > Los Angeles County
        - Los Angeles (0.04)
      - Illinois > Cook County
        - Chicago (0.04)
      - Minnesota (0.04)
      - Oregon (0.04)
- Genre:
  - Research Report > New Finding (0.48)
- Industry:
  - Leisure & Entertainment > Sports > Football (0.93)
- Technology: