LEAF: Learning and Evaluation Augmented by Fact-Checking to Improve Factualness in Large Language Models

Tran, Hieu, Wang, Junda, Ting, Yujan, Huang, Weijing, Chen, Terrence

Oct-30-2024–arXiv.org Artificial Intelligence

Large language models (LLMs) have shown remarkable capabilities across natural language processing tasks, yet they often struggle to maintain factual accuracy, particularly in knowledge-intensive domains such as healthcare. This study introduces LEAF: Learning and Evaluation Augmented by Fact-Checking, a novel approach designed to improve the factual reliability of LLMs, with a focus on medical question answering (QA). LEAF uses a dual strategy to improve the factual accuracy of responses from models such as Llama 3 70B Instruct and Llama 3 8B Instruct. The first strategy, Fact-Check-Then-RAG, improves Retrieval-Augmented Generation (RAG) by using fact-checking results to guide the retrieval process, without updating model parameters. The second strategy, Learning from Fact-Checks via Self-Training, either applies supervised fine-tuning (SFT) to fact-checked responses or applies Simple Preference Optimization (SimPO) with fact-checking as the ranking mechanism; both variants update the LLM's parameters from this supervision. Our findings suggest that integrating fact-checked responses, whether through RAG enhancement or self-training, improves the reliability and factual correctness of LLM outputs, offering a promising solution for applications where information accuracy is crucial.
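As a rough illustration of the self-training strategy described above, the sketch below shows one way fact-check scores could be turned into training data: high-scoring responses are kept for SFT, and the best and worst responses to a prompt form a (chosen, rejected) pair for preference optimization such as SimPO. This is a minimal sketch under stated assumptions, not the authors' implementation. The `fact_check` callable, the `Candidate` type, and the `threshold` and `min_gap` values are all hypothetical; the abstract does not specify how scores are computed or which cutoffs are used.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Candidate:
    text: str
    fact_score: float  # assumed: share of claims judged supported, in [0, 1]


def score_candidates(
    responses: List[str], fact_check: Callable[[str], float]
) -> List[Candidate]:
    """Attach a fact-check score to each sampled response.

    `fact_check` is a hypothetical stand-in for whatever fact-checking
    system is used; it only needs to return a number in [0, 1].
    """
    return [Candidate(text=r, fact_score=fact_check(r)) for r in responses]


def select_for_sft(cands: List[Candidate], threshold: float = 0.9) -> List[str]:
    """Keep responses whose score clears an (assumed) threshold for SFT."""
    return [c.text for c in cands if c.fact_score >= threshold]


def preference_pair(
    cands: List[Candidate], min_gap: float = 0.2
) -> Optional[Tuple[str, str]]:
    """Use fact-checking as a ranking signal: return (chosen, rejected).

    Returns None when there are fewer than two candidates or when the
    best and worst scores are too close to give a useful preference.
    """
    if len(cands) < 2:
        return None
    ranked = sorted(cands, key=lambda c: c.fact_score, reverse=True)
    best, worst = ranked[0], ranked[-1]
    if best.fact_score - worst.fact_score < min_gap:
        return None
    return best.text, worst.text


if __name__ == "__main__":
    # Toy scores standing in for a real fact-checker (illustrative only).
    toy_scores = {"response A": 0.95, "response B": 0.40, "response C": 0.70}
    cands = score_candidates(list(toy_scores), toy_scores.__getitem__)
    print("SFT examples:", select_for_sft(cands))
    print("Preference pair:", preference_pair(cands))
```

The resulting SFT examples and preference pairs would then be passed to a fine-tuning library of choice; the `min_gap` filter is one plausible design choice for discarding pairs whose ranking is too uncertain to be informative.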

Country:
- North America
  - United States > Massachusetts
    - Hampshire County > Amherst (0.14)
  - Mexico > Mexico City
    - Mexico City (0.04)
  - Canada > British Columbia
    - Metro Vancouver Regional District > Vancouver (0.04)

Genre:
- Research Report
  - New Finding (1.00)
  - Promising Solution (0.86)

Industry:
- Health & Medicine
  - Pharmaceuticals & Biotechnology (1.00)
  - Diagnostic Medicine (1.00)
  - Consumer Health (0.93)
  - Therapeutic Area
    - Oncology (1.00)
    - Infections and Infectious Diseases (0.93)
    - Cardiology/Vascular Diseases (0.67)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
