What Are They Talking About? A Benchmark of Knowledge-Grounded Discussion Summarization
Zhou, Weixiao, Zhu, Junnan, Li, Gengyao, Cheng, Xianfu, Liang, Xinnian, Zhai, Feifei, Li, Zhoujun
–arXiv.org Artificial Intelligence
Traditional dialogue summarization primarily focuses on dialogue content, assuming it comprises adequate information for a clear summary. However, this assumption often fails for discussions grounded in shared background, where participants frequently omit context and use implicit references. This results in summaries that are confusing to readers unfamiliar with the background. To address this, we introduce Knowledge-Grounded Discussion Summarization (KGDS), a novel task that produces a supplementary background summary for context and a clear opinion summary with clarified references. To facilitate research, we construct the first KGDS benchmark, featuring news-discussion pairs and expert-created multi-granularity gold annotations for evaluating sub-summaries. We also propose a novel hierarchical evaluation framework with fine-grained and interpretable metrics. Our extensive evaluation of 12 advanced large language models (LLMs) reveals that KGDS remains a significant challenge. The models frequently miss key facts and retain irrelevant ones in background summarization, and often fail to resolve implicit references in opinion summary integration.
arXiv.org Artificial Intelligence
Nov-7-2025
- Country:
- Asia
- Europe > Ireland
- Leinster > County Dublin > Dublin (0.04)
- North America
- Canada > Ontario
- Toronto (0.04)
- Dominican Republic (0.04)
- Mexico > Mexico City
- Mexico City (0.04)
- United States
- Florida > Miami-Dade County
- Miami (0.04)
- Missouri > Jackson County
- Kansas City (0.14)
- Washington > King County
- Seattle (0.04)
- Florida > Miami-Dade County
- Canada > Ontario
- Genre:
- Research Report (0.82)
- Technology: