Learning to Verify Summary Facts with Fine-Grained LLM Feedback
Jihwan Oh, Jeonghwan Choi, Nicole Hee-Yeon Kim, Taewon Yun, Hwanjun Song
–arXiv.org Artificial Intelligence
Training automatic summary fact verifiers is often hampered by the scarcity of human-labeled data. In this paper, we explore an alternative: leveraging Large Language Model (LLM)-generated feedback to address the inherent limitations of human-labeled data. We introduce FineSumFact, a large-scale dataset containing fine-grained factual feedback on summaries. We employ 10 distinct LLMs to generate diverse summaries and Llama-3-70B-Instruct to provide feedback. We use this dataset to fine-tune the lightweight open-source model Llama-3-8B-Instruct, optimizing resource efficiency while maintaining high performance. Our experimental results reveal that a model trained on the extensive LLM-generated dataset surpasses one trained on smaller human-annotated datasets when evaluated on human-generated test sets. Fine-tuning fact verification models with LLM feedback can thus be more effective and cost-efficient than using human feedback. The dataset is available at https://github.com/DISL-Lab/FineSumFact.
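The pipeline the abstract describes, collecting sentence-level factuality feedback from a strong LLM and packing it into supervised fine-tuning records for a lightweight verifier, can be sketched as below. This is a minimal illustration under assumed prompt and record formats (the names `SentenceFeedback`, `build_feedback_prompt`, and `to_training_record` are hypothetical; the actual FineSumFact formats are defined in the linked repository):

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SentenceFeedback:
    """One unit of fine-grained feedback: a summary sentence,
    a factuality label, and a short justification."""
    sentence: str
    label: str   # e.g. "factual" or "non-factual"
    reason: str


def build_feedback_prompt(document: str, summary: str) -> str:
    """Assumed prompt asking a strong LLM (e.g. Llama-3-70B-Instruct)
    for sentence-level factuality feedback on a summary."""
    return (
        "Check each sentence of the summary against the source document.\n"
        f"Document:\n{document}\n\n"
        f"Summary:\n{summary}\n\n"
        "For each summary sentence, answer 'factual' or 'non-factual' "
        "and give a one-line reason."
    )


def to_training_record(document: str, summary: str,
                       feedback: List[SentenceFeedback]) -> dict:
    """Pack the collected LLM feedback into one prompt/completion pair
    for fine-tuning a lightweight verifier (e.g. Llama-3-8B-Instruct)."""
    completion = "\n".join(
        f"{i + 1}. {fb.label}: {fb.reason}"
        for i, fb in enumerate(feedback)
    )
    return {
        "prompt": build_feedback_prompt(document, summary),
        "completion": completion,
    }
```

Records in this shape could then be fed to any standard supervised fine-tuning loop; the key idea is that the expensive teacher model is queried once to build the dataset, while only the small model is trained.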
Dec-14-2024