What Are They Talking About? A Benchmark of Knowledge-Grounded Discussion Summarization

Zhou, Weixiao, Zhu, Junnan, Li, Gengyao, Cheng, Xianfu, Liang, Xinnian, Zhai, Feifei, Li, Zhoujun

Nov-7-2025–arXiv.org Artificial Intelligence

Traditional dialogue summarization primarily focuses on dialogue content, assuming it comprises adequate information for a clear summary. However, this assumption often fails for discussions grounded in shared background, where participants frequently omit context and use implicit references. This results in summaries that are confusing to readers unfamiliar with the background. To address this, we introduce Knowledge-Grounded Discussion Summarization (KGDS), a novel task that produces a supplementary background summary for context and a clear opinion summary with clarified references. To facilitate research, we construct the first KGDS benchmark, featuring news-discussion pairs and expert-created multi-granularity gold annotations for evaluating sub-summaries. We also propose a novel hierarchical evaluation framework with fine-grained and interpretable metrics. Our extensive evaluation of 12 advanced large language models (LLMs) reveals that KGDS remains a significant challenge. The models frequently miss key facts and retain irrelevant ones in background summarization, and often fail to resolve implicit references in opinion summary integration.

computational linguistics, large language model, machine learning


