Self-contradictory Hallucinations of Large Language Models: Evaluation, Detection and Mitigation

Mündler, Niels, He, Jingxuan, Jenko, Slobodan, Vechev, Martin

arXiv.org Artificial Intelligence 

Large language models (large LMs) are susceptible to producing text that contains hallucinated content. An important instance of this problem is self-contradiction, where the LM generates two contradictory sentences within the same context. In this work, we present a comprehensive investigation into self-contradiction for various instruction-tuned LMs, covering evaluation, detection, and mitigation. Our analysis reveals the prevalence of self-contradictions when LMs generate text on open-domain topics, e.g., in 17.7% of all sentences produced by ChatGPT. Self-contradictions also complement retrieval-based methods, as a large portion of them (e.g., 35.8% for ChatGPT) cannot be verified using Wikipedia. We then propose a novel prompting-based framework designed to effectively detect and mitigate self-contradictions. Our detector achieves high accuracy, e.g., around 80% F1 score when prompting ChatGPT. The mitigation algorithm iteratively refines the generated text to remove contradictory information while preserving text fluency and informativeness. Importantly, our entire framework is applicable to black-box LMs and does not require external grounded knowledge. Our approach is practically effective and has been released as a push-button tool to benefit the public, available at https://chatprotect.ai/.
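To make the prompting-based detection and mitigation idea concrete, the following is a minimal Python sketch. It is an illustration only, not the paper's actual prompts or the chatprotect.ai implementation: `query_llm`, `detect_contradiction`, and `mitigate` are hypothetical names, and the black-box LM call is left as a placeholder.

```python
# Hedged sketch of prompting-based self-contradiction detection and mitigation.
# All function names and prompt wordings here are illustrative assumptions,
# not the paper's or tool's actual API.

def query_llm(prompt: str) -> str:
    """Placeholder for a black-box chat-LM call (e.g., ChatGPT)."""
    raise NotImplementedError("Plug in your LM client here.")

def detect_contradiction(sentence_a: str, sentence_b: str, context: str) -> bool:
    """Ask the LM whether two sentences from the same context contradict each other."""
    prompt = (
        f"Context: {context}\n"
        f"Sentence 1: {sentence_a}\n"
        f"Sentence 2: {sentence_b}\n"
        "Do these two sentences contradict each other? Answer Yes or No."
    )
    answer = query_llm(prompt)
    return answer.strip().lower().startswith("yes")

def mitigate(sentence_a: str, sentence_b: str, context: str) -> str:
    """Ask the LM to rewrite a contradictory pair, dropping the conflicting
    claim while keeping the remaining information and fluency."""
    prompt = (
        f"Context: {context}\n"
        f"Sentence 1: {sentence_a}\n"
        f"Sentence 2: {sentence_b}\n"
        "These sentences contradict each other. Rewrite them as a single, "
        "fluent passage that removes the conflicting claim but keeps all "
        "other information."
    )
    return query_llm(prompt)
```

In this sketch, mitigation would be applied iteratively: detected contradictory sentence pairs are rewritten, and the revised text is re-checked until no contradictions remain, which mirrors the iterative refinement described in the abstract.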
