Measuring text summarization factuality using atomic facts entailment metrics in the context of retrieval augmented generation

Aug-27-2024 – arXiv.org Artificial Intelligence

The use of large language models (LLMs) has increased significantly since the introduction of ChatGPT in 2022, demonstrating their value across a range of applications. However, a major obstacle to enterprise and commercial adoption of LLMs is their tendency to generate inaccurate information, a phenomenon known as "hallucination." This project proposes a method for estimating the factuality of an LLM-generated summary relative to its source text. Our approach uses Naive Bayes classification to assess the accuracy of the generated content.
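
To make the idea concrete, here is a minimal sketch of how a Naive Bayes classifier could label a summary from atomic-fact entailment scores. It is not the paper's implementation: the abstract does not specify the features or the model variant, so the two features in `featurize`, the Gaussian likelihood, the 0.5 threshold, and all of the training scores are assumptions made up for illustration. The scores stand in for what an entailment (NLI) model might return for each atomic fact of a summary when checked against the source text.

```python
import math
from collections import defaultdict


def featurize(scores, threshold=0.5):
    """Aggregate per-fact entailment scores into two summary-level features.

    Hypothetical feature choice: share of facts scored as entailed, and the
    score of the weakest fact.
    """
    entailed_share = sum(1 for s in scores if s >= threshold) / len(scores)
    return [entailed_share, min(scores)]


class GaussianNaiveBayes:
    """Small Gaussian Naive Bayes written with the standard library only."""

    def fit(self, X, y):
        grouped = defaultdict(list)
        for row, label in zip(X, y):
            grouped[label].append(row)
        self.priors, self.stats = {}, {}
        for label, rows in grouped.items():
            self.priors[label] = len(rows) / len(X)
            self.stats[label] = []
            for column in zip(*rows):
                mean = sum(column) / len(column)
                # Variance floor avoids division by zero on constant features.
                var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-3
                self.stats[label].append((mean, var))
        return self

    def log_posteriors(self, row):
        """Unnormalized log posterior per class under feature independence."""
        result = {}
        for label, prior in self.priors.items():
            total = math.log(prior)
            for value, (mean, var) in zip(row, self.stats[label]):
                total += -0.5 * math.log(2 * math.pi * var)
                total -= (value - mean) ** 2 / (2 * var)
            result[label] = total
        return result

    def predict(self, row):
        scores = self.log_posteriors(row)
        return max(scores, key=scores.get)


# Made-up training data: entailment scores for the atomic facts of six summaries.
train_scores = [
    [0.95, 0.90, 0.88],
    [0.80, 0.92, 0.97, 0.85],
    [0.90, 0.75],
    [0.90, 0.20, 0.10],
    [0.30, 0.40],
    [0.85, 0.60, 0.05],
]
train_labels = ["factual"] * 3 + ["hallucinated"] * 3

model = GaussianNaiveBayes().fit([featurize(s) for s in train_scores], train_labels)
print(model.predict(featurize([0.93, 0.90, 0.87])))
```

The classifier treats the two features as conditionally independent given the label, which is the defining Naive Bayes assumption. In practice the features would be computed from real entailment outputs and the model would be fit on labeled summaries.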

Country:
- Africa > South Africa (0.04)
- North America > United States
  - New York (0.04)
- Europe
  - United Kingdom > Northern Ireland (0.04)
  - France (0.04)
  - Middle East > Malta
    - Eastern Region > Northern Harbour District > St. Julian's (0.04)
- Asia
  - Middle East > Jordan (0.05)
  - Thailand (0.04)

Genre:
- Research Report (0.44)

Industry:
- Transportation > Air (0.47)
- Leisure & Entertainment > Sports (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning
    - Neural Networks > Deep Learning (1.00)
    - Learning Graphical Models > Directed Networks
      - Bayesian Learning (1.00)
