Element-aware Summarization with Large Language Models: Expert-aligned Evaluation and Chain-of-Thought Method
Wang, Yiming, Zhang, Zhuosheng, Wang, Rui
arXiv.org Artificial Intelligence
Automatic summarization generates concise summaries that capture the key ideas of source documents. CNN/DailyMail and BBC XSum, the most mainstream datasets for the news sub-domain, have been widely used for performance benchmarking. However, the reference summaries in these datasets turn out to be noisy, mainly in terms of factual hallucination and information redundancy. To address this challenge, we first annotate new expert-written Element-aware test sets following the "Lasswell Communication Model" proposed by Lasswell (1948), allowing reference summaries to cover more fine-grained news elements objectively and comprehensively. Using the new test sets, we observe the surprising zero-shot summarization ability of LLMs, which resolves the inconsistency reported in prior work between human preference and automatic evaluation metrics for LLMs' zero-shot summaries. Further, we propose a Summary Chain-of-Thought (SumCoT) technique that elicits LLMs to generate summaries step by step, helping them integrate more fine-grained details of the source documents into final summaries that align with the human writing mindset. Experimental results show that our method outperforms state-of-the-art fine-tuned PLMs and zero-shot LLMs by +4.33/+4.77 in ROUGE-L on the two datasets, respectively. The dataset and code are publicly available at https://github.com/Alsace08/SumCoT.
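The step-by-step idea described in the abstract can be sketched as a two-stage prompting loop: first ask the model to extract news elements, then ask it to write a summary that integrates them. This is only a minimal illustration under stated assumptions, not the authors' implementation. The guiding questions, the prompt wording, the function names, and the `llm` callable are all hypothetical; the actual prompts are in the repository linked above.

```python
from typing import Callable, List

# Hypothetical guiding questions for element extraction. They are illustrative
# only and are not taken from the paper or its repository.
ELEMENT_QUESTIONS: List[str] = [
    "Who are the important entities in this document?",
    "What are the important dates in this document?",
    "What events are happening in this document?",
    "What is the result of these events?",
]


def build_extraction_prompt(article: str) -> str:
    """Stage 1 prompt: ask the model to answer the element questions."""
    questions = "\n".join(
        f"{i}. {q}" for i, q in enumerate(ELEMENT_QUESTIONS, start=1)
    )
    return f"Article: {article}\n\nAnswer the following questions:\n{questions}\n"


def build_summary_prompt(article: str, elements: str) -> str:
    """Stage 2 prompt: ask for a summary that integrates the extracted elements."""
    return (
        f"Article: {article}\n\n"
        f"Extracted elements:\n{elements}\n\n"
        "Using the article and the extracted elements, write a concise summary."
    )


def summarize_step_by_step(article: str, llm: Callable[[str], str]) -> str:
    """Run the two stages with any text-in, text-out model wrapper (assumed)."""
    elements = llm(build_extraction_prompt(article))
    return llm(build_summary_prompt(article, elements))
```

Passing the model as a plain callable keeps the sketch independent of any particular LLM API; any wrapper that maps a prompt string to a completion string can be supplied.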
May-22-2023
- Country:
- Africa > Ethiopia
- Addis Ababa > Addis Ababa (0.04)
- Asia
- Europe
- Belgium > Brussels-Capital Region
- Brussels (0.04)
- United Kingdom
- England
- Buckinghamshire > Milton Keynes (0.04)
- Dorset > Bournemouth (0.04)
- Greater London > London (0.14)
- Merseyside > Liverpool (0.04)
- North Yorkshire (0.04)
- West Midlands (0.04)
- Northern Ireland (0.04)
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- France > Île-de-France
- Romania (0.04)
- Western Europe (0.04)
- Isle of Man (0.04)
- Middle East > Cyprus (0.04)
- Portugal > Lisbon
- Lisbon (0.04)
- Norway (0.04)
- Denmark > Capital Region
- Copenhagen (0.04)
- Spain
- Catalonia > Barcelona Province
- Barcelona (0.04)
- Galicia > Madrid (0.04)
- Italy > Tuscany
- Florence (0.04)
- Germany > Berlin (0.04)
- Bulgaria (0.04)
- North America
- Canada > Quebec
- Montreal (0.04)
- United States
- California > Los Angeles County
- Long Beach (0.04)
- Illinois > Cook County
- Chicago (0.04)
- Indiana > Marion County
- Lawrence (0.04)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Maryland (0.04)
- Michigan (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.14)
- Pennsylvania (0.04)
- Oceania > Australia (0.05)
- South America > Chile
- Genre:
- Research Report > New Finding (0.48)
- Industry:
- Government
- Health & Medicine (1.00)
- Law > Criminal Law (1.00)
- Law Enforcement & Public Safety
- Corrections (0.93)
- Crime Prevention & Enforcement (1.00)
- Leisure & Entertainment > Sports
- Soccer (1.00)
- Media (1.00)
- Transportation > Ground (0.93)
- Technology: