Modeling Unified Semantic Discourse Structure for High-quality Headline Generation
Xu, Minghui, Fei, Hao, Li, Fei, Wu, Shengqiong, Sun, Rui, Teng, Chong, Ji, Donghong
arXiv.org Artificial Intelligence
Headline generation aims to summarize a long document with a short, catchy title that reflects the main idea. This requires accurately capturing the core document semantics, which is challenging because the texts are lengthy and rich in background information. In this work, we propose using a unified semantic discourse structure (S3) to represent document semantics. S3 graphs are constructed by combining document-level rhetorical structure theory (RST) trees with sentence-level abstract meaning representation (AMR) graphs. The resulting hierarchical composition of sentences, clauses, and words intrinsically characterizes the semantic meaning of the overall document. We then develop a headline generation framework in which the S3 graphs are encoded as contextual features. To consolidate the efficacy of S3 graphs, we further devise a hierarchical structure pruning mechanism that dynamically screens out redundant and nonessential nodes within the graph. Experimental results on two headline generation datasets demonstrate that our method consistently outperforms existing state-of-the-art methods. Our work can be instructive for a broad range of document modeling tasks beyond headline generation and summarization.
Mar-23-2024
- Country:
  - North America > United States (0.68)
- Genre:
  - Research Report (0.40)
- Industry:
  - Health & Medicine (0.46)
  - Law > Criminal Law (0.46)
  - Leisure & Entertainment > Sports
    - Boxing (0.68)
- Technology: