Detecting LLM Fact-conflicting Hallucinations Enhanced by Temporal-logic-based Reasoning
Ningke Li, Yahui Song, Kailong Wang, Yuekang Li, Ling Shi, Yi Liu, Haoyu Wang
arXiv.org Artificial Intelligence, Feb-18-2025
Abstract--Large language models (LLMs) face the challenge of hallucinations: outputs that seem coherent but are actually incorrect. A particularly damaging type is fact-conflicting hallucination (FCH), where generated content contradicts established facts. Addressing FCH presents three main challenges: 1) automatically constructing and maintaining large-scale benchmark datasets is difficult and resource-intensive; 2) generating complex and effective test cases that the LLM has not been trained on, especially those involving intricate temporal features, is challenging yet crucial for eliciting hallucinations; and 3) validating the reasoning behind LLM outputs is inherently difficult, particularly for complex logical relationships, as it requires transparency into the model's decision-making process. To address these challenges, we apply temporal-logic-based reasoning over established facts to automatically construct test cases. LLMs are then tested on these cases through template-based prompts, which require them to generate both answers and reasoning steps. To validate the reasoning, we propose two semantic-aware oracles that compare the semantic structure of LLM outputs to the ground truths. Key insights reveal that LLMs struggle with out-of-distribution knowledge and logical reasoning. These findings highlight the importance of continued efforts to detect and mitigate hallucinations in LLMs.

Large Language Models (LLMs) have revolutionized language processing, demonstrating impressive text generation and comprehension capabilities across diverse applications. However, despite their growing use, LLMs face significant security and privacy challenges [1], [2], [3], [4], [5], which affect their overall effectiveness and reliability. A critical issue is the phenomenon of hallucination, where LLMs generate outputs that are coherent but factually incorrect or irrelevant. This tendency to produce misleading information compromises the safety and usability of LLM-based systems.

This paper focuses on fact-conflicting hallucination (FCH), the most prominent form of hallucination in LLMs. FCH occurs when LLMs generate content that directly contradicts established facts. For instance, as illustrated in Figure 1, an LLM incorrectly asserts that "Haruki Murakami won the Nobel Prize in Literature in 2016", whereas the fact is that "Haruki Murakami has not won the Nobel Prize, though he has received numerous other literary awards". Such inaccuracies can lead to significant user confusion and undermine the trust and reliability that are crucial for LLM applications.

N. Li, K. Wang, and H. Wang are with Huazhong University of Science and Technology, China. Y. Song is with the National University of Singapore, Singapore. Y. Li is with the University of New South Wales, Australia.
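The abstract outlines a pipeline in which temporal-logic-based reasoning over established facts yields test cases that are posed to the LLM through template-based prompts. The Python sketch below is a minimal illustration of that general idea, not the paper's implementation: a handful of hand-written award facts stands in for a mined knowledge base, a single toy "received A before B" rule stands in for the paper's temporal reasoning rules, and the prompt template and all identifiers are hypothetical.

from dataclasses import dataclass
from itertools import combinations

# A few real award facts, hand-written purely for illustration; a real system
# would mine time-stamped facts from a large knowledge base instead.
FACTS = [
    ("Haruki Murakami", "received", "Franz Kafka Prize", 2006),
    ("Haruki Murakami", "received", "Jerusalem Prize", 2009),
    ("Kazuo Ishiguro", "received", "Booker Prize", 1989),
    ("Kazuo Ishiguro", "received", "Nobel Prize in Literature", 2017),
]

@dataclass
class DerivedFact:
    subject: str
    earlier_award: str
    later_award: str
    earlier_year: int
    later_year: int

def derive_before_facts(facts):
    """Toy temporal rule: received(S, A, Y1), received(S, B, Y2), Y1 < Y2
    => received_before(S, A, B). Deriving such composite facts is meant to
    yield questions unlikely to appear verbatim in training data."""
    derived = []
    awards = [f for f in facts if f[1] == "received"]
    for (s1, _, a1, y1), (s2, _, a2, y2) in combinations(awards, 2):
        if s1 != s2 or y1 == y2:
            continue
        if y1 < y2:
            derived.append(DerivedFact(s1, a1, a2, y1, y2))
        else:
            derived.append(DerivedFact(s1, a2, a1, y2, y1))
    return derived

# Hypothetical prompt template asking for both an answer and reasoning steps.
PROMPT_TEMPLATE = (
    "Answer Yes or No, then explain your reasoning step by step.\n"
    "Question: Did {subject} receive the {earlier} before the {later}?"
)

def make_test_cases(facts):
    """Render each derived fact into a (prompt, ground-truth answer) pair."""
    cases = []
    for d in derive_before_facts(facts):
        prompt = PROMPT_TEMPLATE.format(
            subject=d.subject, earlier=d.earlier_award, later=d.later_award)
        truth = (f"Yes. {d.subject} received the {d.earlier_award} in "
                 f"{d.earlier_year}, which is before the {d.later_award} "
                 f"in {d.later_year}.")
        cases.append((prompt, truth))
    return cases

if __name__ == "__main__":
    for prompt, truth in make_test_cases(FACTS):
        print(prompt)
        print("Ground truth:", truth)
        print()

Running this prints one prompt/ground-truth pair per author in the toy fact list; pairs of that shape are what the oracle sketch further below consumes.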
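The paper's two semantic-aware oracles compare the semantic structure of the LLM's answer and reasoning against the ground truth. The snippet below is only a rough stand-in for that idea under stated assumptions: it checks the leading Yes/No answer and scores the reasoning with a bag-of-words cosine similarity rather than a genuine semantic-structure comparison, and the threshold, helper names, and example strings (reusing the Murakami case from the previous sketch) are illustrative.

import math
import re
from collections import Counter

def bag_of_words(text):
    # Lower-cased token counts; a crude stand-in for a real semantic encoder.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def split_answer(output):
    # Assumes the model was prompted to answer "Yes" or "No" first, then explain.
    head, _, reasoning = output.partition(".")
    return head.strip().lower(), reasoning.strip()

def semantic_oracle(llm_output, ground_truth, threshold=0.75):
    """Flag a potential fact-conflicting hallucination when the final answer
    disagrees with the ground truth or the reasoning drifts away from it."""
    llm_answer, llm_reasoning = split_answer(llm_output)
    gt_answer, gt_reasoning = split_answer(ground_truth)
    answer_ok = llm_answer == gt_answer
    reasoning_similarity = cosine(bag_of_words(llm_reasoning),
                                  bag_of_words(gt_reasoning))
    suspected = (not answer_ok) or reasoning_similarity < threshold
    return suspected, reasoning_similarity

if __name__ == "__main__":
    truth = ("Yes. Haruki Murakami received the Franz Kafka Prize in 2006, "
             "which is before the Jerusalem Prize in 2009.")
    output = ("No. Haruki Murakami won the Nobel Prize in Literature in 2016, "
              "after receiving the Jerusalem Prize.")
    suspected, similarity = semantic_oracle(output, truth)
    print(f"hallucination suspected: {suspected} "
          f"(reasoning similarity = {similarity:.2f})")

Because the sampled output both flips the Yes/No answer and asserts the fabricated Nobel Prize claim from the abstract's example, this toy oracle flags it as a suspected fact-conflicting hallucination.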