Pre-trained Language Models Return Distinguishable Probability Distributions to Unfaithfully Hallucinated Texts
arXiv.org Artificial Intelligence
In this work, we show that pre-trained language models return distinguishable generation probability and uncertainty distributions for unfaithfully hallucinated texts, regardless of model size and structure. Examining 24 models on 6 datasets, we find that 88-98% of cases yield statistically significantly distinguishable generation probability and uncertainty distributions. Building on this general phenomenon, we showcase a hallucination-reducing training algorithm that outperforms other baselines, achieving higher faithfulness metrics while maintaining sound general text-quality measures.
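The two signals the abstract refers to can be illustrated with a minimal sketch: sequence log-probability (how likely the model finds the generated tokens) and predictive entropy (how uncertain the model is at each step). The function names and toy numbers below are hypothetical illustrations, not the paper's actual measurements or code:

```python
import math

def sequence_logprob(token_probs):
    # Sum of log-probabilities the model assigned to each generated token.
    return sum(math.log(p) for p in token_probs)

def mean_entropy(step_dists):
    # Average Shannon entropy of the per-step next-token distributions;
    # higher entropy means the model was more uncertain while generating.
    def entropy(dist):
        return -sum(p * math.log(p) for p in dist if p > 0)
    return sum(entropy(d) for d in step_dists) / len(step_dists)

# Toy example: suppose a faithful continuation receives higher per-token
# probabilities than a hallucinated one (hypothetical numbers).
faithful_probs = [0.9, 0.8, 0.85]
hallucinated_probs = [0.3, 0.2, 0.25]
print(sequence_logprob(faithful_probs) > sequence_logprob(hallucinated_probs))

# A peaked distribution (confident) has lower entropy than a flat one.
peaked = [[0.97, 0.01, 0.01, 0.01]]
flat = [[0.25, 0.25, 0.25, 0.25]]
print(mean_entropy(peaked) < mean_entropy(flat))
```

The paper's claim is that, aggregated over many samples, these two distributions differ statistically significantly between faithful and hallucinated texts, which is what makes the signal usable in a training objective.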
Sep-25-2024