The Power of Summary-Source Alignments
Ori Ernst, Ori Shapira, Aviv Slobodkin, Sharon Adar, Mohit Bansal, Jacob Goldberger, Ran Levy, Ido Dagan
arXiv.org Artificial Intelligence
Multi-document summarization (MDS) is a challenging task, often decomposed into subtasks of salience and redundancy detection, followed by text generation. In this context, aligning corresponding sentences between a reference summary and its source documents has been leveraged to generate training data for some of the component tasks. Yet this enabling alignment step has usually been applied heuristically, at the sentence level, and for a limited number of subtasks. In this paper, we propose extending the summary-source alignment framework by (1) applying it at the more fine-grained proposition span level, (2) annotating alignments manually in a multi-document setup, and (3) showing that summary-source alignments can yield datasets for at least six different tasks. Specifically, for each of the tasks, we release a manually annotated test set that was derived automatically from the alignment annotation. We also release development and training sets derived in the same way, but from automatically derived alignments. Using the datasets, we demonstrate each task with baseline models and corresponding evaluation metrics to spur future research on this broad challenge.
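To make the idea of deriving task datasets from alignments concrete, here is a minimal Python sketch. It is not the authors' pipeline: the `Alignment` record, the two helper functions, and the toy spans are all hypothetical, and the two derived views (salience labels and groups of redundant source spans) are only assumed illustrations of the salience and redundancy subtasks mentioned in the abstract.

```python
from collections import defaultdict
from typing import NamedTuple


class Alignment(NamedTuple):
    """Hypothetical record: one summary span paired with one source span."""
    summary_span: str
    doc_id: str
    source_span: str


def derive_salience_labels(candidates, alignments):
    """Label a candidate (doc_id, span) as salient if it is aligned to any summary span."""
    aligned = {(a.doc_id, a.source_span) for a in alignments}
    return {c: c in aligned for c in candidates}


def derive_clusters(alignments):
    """Group source spans that align to the same summary span (assumed redundancy view)."""
    clusters = defaultdict(list)
    for a in alignments:
        clusters[a.summary_span].append((a.doc_id, a.source_span))
    return dict(clusters)


# Toy data, invented purely for illustration.
alignments = [
    Alignment("the storm closed schools", "doc1", "schools were shut by the storm"),
    Alignment("the storm closed schools", "doc2", "the storm forced schools to close"),
    Alignment("power was restored", "doc2", "crews restored power"),
]
candidates = [
    ("doc1", "schools were shut by the storm"),
    ("doc1", "the mayor spoke on Tuesday"),
    ("doc2", "the storm forced schools to close"),
    ("doc2", "crews restored power"),
]

salience = derive_salience_labels(candidates, alignments)
clusters = derive_clusters(alignments)
```

In this sketch a single alignment annotation feeds both views, which is the general point of the abstract: the alignments are annotated once and the per-task data is derived from them automatically.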
Jun-2-2024
- Country:
  - Oceania > Australia
  - North America
    - Dominican Republic (0.04)
    - United States
      - Ohio (0.04)
      - Massachusetts (0.04)
      - Minnesota > Hennepin County
        - Minneapolis (0.14)
      - Nevada > Clark County
        - Las Vegas (0.04)
      - Colorado > Denver County
        - Denver (0.04)
      - California > San Francisco County
        - San Francisco (0.04)
      - Louisiana > Orleans Parish
        - New Orleans (0.04)
      - Georgia > Fulton County
        - Atlanta (0.04)
      - New York > New York County
        - New York City (0.04)
    - Canada > Ontario
      - Toronto (0.04)
  - Europe
    - Spain > Catalonia
      - Barcelona Province > Barcelona (0.04)
    - Italy > Tuscany
      - Florence (0.04)
    - Ireland > Leinster
      - County Dublin > Dublin (0.04)
    - Germany > Hesse
      - Darmstadt Region > Wiesbaden (0.04)
    - Croatia > Dubrovnik-Neretva County
      - Dubrovnik (0.04)
    - Belgium > Brussels-Capital Region
      - Brussels (0.04)
  - Asia
    - Singapore (0.04)
    - Middle East > UAE
      - Abu Dhabi Emirate > Abu Dhabi (0.04)
    - Japan
      - Kyūshū & Okinawa > Kyūshū
        - Miyazaki Prefecture > Miyazaki (0.04)
      - Honshū > Chūbu
        - Aichi Prefecture > Nagoya (0.04)
    - China
- Genre:
  - Research Report (0.50)
- Industry:
  - Media > Music (1.00)
  - Leisure & Entertainment > Sports
    - Football (1.00)
- Technology: