Cost-Effective Hallucination Detection for LLMs

Simon Valentin, Jinmiao Fu, Gianluca Detommaso, Shaoyuan Xu, Giovanni Zappella, Bryan Wang

arXiv.org Machine Learning 

Despite their impressive capabilities, large language models (LLMs) can be prone to generating hallucinations -- undesirable outputs that are incorrect, unfaithful, or inconsistent with respect to the inputs (or the output itself) [1]. These unreliable behaviors pose significant risks for adopting LLMs in real-world applications. Detecting hallucinations is challenging because, among other things, they take different forms, are context-dependent, and can conflict with other desirable properties of generated text [2, 3]. Hallucinations may be harmless in some contexts but undesirable or even dangerous in others (e.g., erroneous medical advice). Detecting and quantifying hallucination risk is thus a critical capability for enabling safe applications of LLMs and improving generated outputs. Prior work has proposed various approaches for detecting and mitigating hallucinations in LLM-generated outputs, including verifying faithfulness to inputs [4], assessing internal coherence [5], consulting external knowledge sources [6], and quantifying model uncertainty [2, 3, 7, 8]. However, deploying these methods in production settings is far from trivial due to several challenges: First, there is limited comparative evaluation illuminating how different detection methods perform. Second, existing approaches differ greatly in their computational demands, and guidance on cost-effectiveness trade-offs is lacking to inform method selection for real-world applications under resource constraints. Third, hallucination detection in the real world often requires careful consideration of risks and false positive/negative trade-offs, which calls for methods that provide well-calibrated probability scores.
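
To make the calibration requirement concrete, below is a minimal, illustrative sketch (not the paper's implementation) of turning raw hallucination-detector scores into calibrated probabilities with two standard techniques, Platt scaling and isotonic regression, and comparing them via the Brier score. The synthetic scores and labels are placeholders for a real labeled validation set of (detector score, hallucination yes/no) pairs; scikit-learn and NumPy are assumed to be available.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss

rng = np.random.default_rng(0)

# Placeholder data: 1 = hallucination, and higher raw scores loosely indicate
# hallucination. In practice these would come from a labeled validation set.
n = 2000
labels = rng.integers(0, 2, size=n)
raw_scores = rng.normal(loc=labels.astype(float), scale=1.0)

# Split into a calibration set and a held-out evaluation set.
split = n // 2
cal_s, cal_y = raw_scores[:split], labels[:split]
ev_s, ev_y = raw_scores[split:], labels[split:]

# Platt scaling: fit a one-dimensional logistic regression on the raw scores.
platt = LogisticRegression().fit(cal_s.reshape(-1, 1), cal_y)
platt_probs = platt.predict_proba(ev_s.reshape(-1, 1))[:, 1]

# Isotonic regression: a monotonic, non-parametric score-to-probability map.
iso = IsotonicRegression(out_of_bounds="clip").fit(cal_s, cal_y)
iso_probs = iso.predict(ev_s)

# Lower Brier score indicates better-calibrated probabilistic predictions,
# which is what risk-sensitive thresholding on false positives/negatives needs.
print("Brier (Platt):   ", brier_score_loss(ev_y, platt_probs))
print("Brier (isotonic):", brier_score_loss(ev_y, iso_probs))
```

Calibrated probabilities of this kind let an application set decision thresholds directly from its own false-positive/false-negative cost trade-off, rather than from arbitrary raw detector scores.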
