SLPL SHROOM at SemEval2024 Task 06: A comprehensive study on models' ability to detect hallucination
Pouya Fallah, Soroush Gooran, Mohammad Jafarinasab, Pouya Sadeghi, Reza Farnia, Amirreza Tarabkhah, Zainab Sadat Taghavi, Hossein Sameti
Language models, particularly generative models, are susceptible to hallucinations: outputs that contradict factual knowledge or the source text. This study explores methods for detecting hallucinations across the three SemEval-2024 Task 6 subtasks: Machine Translation, Definition Modeling, and Paraphrase Generation. We evaluate two methods: semantic similarity between the generated text and factual references, and an ensemble of language models that judge each other's outputs. Our results show that semantic similarity achieves moderate accuracy and correlation scores on the trial data, while the ensemble method offers insights into the complexities of hallucination detection but falls short of expectations. This work highlights the challenges of hallucination detection and underscores the need for further research in this critical area.
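As a concrete illustration of the first method, below is a minimal sketch of reference-based semantic similarity scoring for hallucination detection. The embedding model (all-MiniLM-L6-v2), the cosine-similarity measure, and the 0.5 decision threshold are illustrative assumptions, not details taken from the paper.

from sentence_transformers import SentenceTransformer, util

# Hypothetical choices: the paper does not specify here which embedding
# model or decision threshold it uses.
model = SentenceTransformer("all-MiniLM-L6-v2")

def is_hallucination(generated: str, reference: str, threshold: float = 0.5) -> bool:
    """Flag a generation as a hallucination when it is not semantically
    similar enough to the factual reference."""
    embeddings = model.encode([generated, reference], convert_to_tensor=True)
    similarity = util.cos_sim(embeddings[0], embeddings[1]).item()
    return similarity < threshold

# Example: checking a paraphrase-generation output against its source.
print(is_hallucination(
    "The Eiffel Tower is located in Berlin.",
    "The Eiffel Tower stands in Paris, France.",
))

A higher threshold trades recall for precision; in practice the cutoff would be tuned on labeled trial data such as the SHROOM trial set.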
arXiv.org Artificial Intelligence
April 9, 2024
- Genre:
- Research Report > New Finding (1.00)
- Technology:
- Information Technology > Artificial Intelligence > Natural Language > Generation (0.49)
- Large Language Model (0.80)
- Machine Translation (0.49)
- Text Processing (0.73)