Multi-round, Chain-of-thought Post-editing for Unfaithful Summaries

Lee, Yi-Hui, Li, Xiangci, Ouyang, Jessica

Jan-19-2025–arXiv.org Artificial Intelligence

Recent large language models (LLMs) have demonstrated a remarkable ability to perform natural language understanding and generation tasks. In this work, we investigate the use of LLMs for evaluating faithfulness in news summarization, finding that it achieves a strong correlation with human judgments. We further investigate LLMs' capabilities as a faithfulness post-editor, experimenting with different chain-of-thought prompts for locating and correcting factual inconsistencies between a generated summary and the source news document and are able to achieve a higher editing success rate than was reported in prior work. We perform both automated and human evaluations of the post-edited summaries, finding that prompting LLMs using chain-of-thought reasoning about factual error types is an effective faithfulness post-editing strategy, performing comparably to fine-tuned post-editing models. We also demonstrate that multiple rounds of post-editing, which has not previously been explored, can be used to gradually improve the faithfulness of summaries whose errors cannot be fully corrected in a single round.

computational linguistic, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

Jan-19-2025

arXiv.org PDF

Add feedback

Country:
- North America
  - Dominican Republic (0.04)
  - United States
    - Texas (0.04)
    - New York (0.04)
    - Washington > King County
      - Seattle (0.04)
  - Canada > Ontario
    - Toronto (0.04)
- Europe
  - Russia (0.05)
  - Ukraine > Crimea (0.04)
  - Germany > Berlin (0.04)
  - Spain > Catalonia
    - Barcelona Province > Barcelona (0.04)
  - Italy > Tuscany
    - Florence (0.04)
  - Belgium > Brussels-Capital Region
    - Brussels (0.04)
- Asia
  - Russia (0.05)
  - Singapore (0.04)
  - China > Hong Kong (0.04)
  - Middle East > UAE
    - Abu Dhabi Emirate > Abu Dhabi (0.04)

Genre:
- Research Report (0.64)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found