Bipartite Graph Pre-training for Unsupervised Extractive Summarization with Graph Convolutional Auto-Encoders
Mao, Qianren, Zhao, Shaobo, Li, Jiarui, Gu, Xiaolei, He, Shizhu, Li, Bo, Li, Jianxin
arXiv.org Artificial Intelligence
Pre-trained sentence representations are crucial for identifying significant sentences in unsupervised extractive document summarization. However, the traditional two-step paradigm of pre-training followed by sentence ranking creates a gap because the two steps optimize different objectives. To address this issue, we argue that pre-trained embeddings derived from a process specifically designed to optimize cohesive and distinctive sentence representations help rank significant sentences. To this end, we propose a novel graph pre-training auto-encoder that obtains sentence embeddings by explicitly modelling intra-sentential distinctive features and inter-sentential cohesive features through sentence-word bipartite graphs. These pre-trained sentence representations are then used in a graph-based ranking algorithm for unsupervised summarization. By providing summary-worthy sentence representations, our method achieves leading performance among unsupervised summarization frameworks. It surpasses heavy BERT- or RoBERTa-based sentence representations in downstream tasks.
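To make the two ingredients named in the abstract concrete, the following is a minimal sketch of a sentence-word bipartite graph and a graph-based sentence ranking. It is not the authors' method: the graph convolutional auto-encoder is omitted entirely, raw TF-IDF edge weights stand in for the pre-trained sentence embeddings, and degree centrality over cosine similarity stands in for the ranking algorithm. The function names `build_bipartite`, `cosine`, and `rank_sentences`, the whitespace tokenizer, and the smoothed IDF formula are all assumptions made for illustration.

```python
import math
from collections import Counter


def build_bipartite(sentences):
    """Build a sentence-word bipartite graph as a weight matrix.

    weights[i][j] is the TF-IDF weight of the edge between sentence i
    and word j (0.0 when the word does not occur in the sentence).
    Assumption: whitespace tokenization and a smoothed IDF.
    """
    tokenized = [s.lower().split() for s in sentences]
    vocab = sorted({w for toks in tokenized for w in toks})
    index = {w: j for j, w in enumerate(vocab)}
    n = len(sentences)
    # Document frequency: number of sentences containing each word.
    df = Counter(w for toks in tokenized for w in set(toks))
    weights = [[0.0] * len(vocab) for _ in range(n)]
    for i, toks in enumerate(tokenized):
        for w, tf in Counter(toks).items():
            idf = math.log((1 + n) / (1 + df[w])) + 1.0
            weights[i][index[w]] = tf * idf
    return vocab, weights


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def rank_sentences(embeddings):
    """Rank sentence indices by degree centrality on the similarity graph.

    Stand-in for the graph-based ranking step: each sentence is scored by
    its summed cosine similarity to every other sentence.
    """
    n = len(embeddings)
    scores = [
        sum(cosine(embeddings[i], embeddings[j]) for j in range(n) if j != i)
        for i in range(n)
    ]
    return sorted(range(n), key=lambda i: scores[i], reverse=True)


sentences = [
    "graph models rank sentences",
    "sentences share words in a graph",
    "the weather is nice today",
]
vocab, weights = build_bipartite(sentences)
order = rank_sentences(weights)
```

In this toy input the third sentence shares no words with the other two, so its centrality is zero and it is ranked last. In the approach the abstract describes, the rows of `weights` would be replaced by embeddings learned by the graph auto-encoder before ranking.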
Oct-29-2023
- Genre:
- Research Report (1.00)