Template-based Abstractive Microblog Opinion Summarisation
Bilal, Iman Munire, Wang, Bo, Tsakalidis, Adam, Nguyen, Dong, Procter, Rob, Liakata, Maria
arXiv.org Artificial Intelligence
We introduce the task of microblog opinion summarisation (MOS) and share a dataset of 3100 gold-standard opinion summaries to facilitate research in this domain. The dataset contains summaries of tweets spanning a 2-year period and covers more topics than any other public Twitter summarisation dataset. Summaries are abstractive in nature and have been created by journalists skilled in summarising news articles following a template separating factual information (main story) from author opinions. Our method differs from previous work on generating gold-standard summaries from social media, which usually involves selecting representative posts and thus favours extractive summarisation models. To showcase the dataset's utility and challenges, we benchmark a range of abstractive and extractive state-of-the-art summarisation models and achieve good performance, with the former outperforming the latter. We also show that fine-tuning is necessary to improve performance and investigate the benefits of using different sample sizes.
Oct-3-2022
- Country:
- Asia
- Europe
- Belgium > Brussels-Capital Region
- Brussels (0.04)
- Italy > Tuscany
- Florence (0.04)
- Spain > Catalonia
- Barcelona Province > Barcelona (0.04)
- Sweden > Vaestra Goetaland
- Gothenburg (0.04)
- United Kingdom > England
- Greater London > London (0.04)
- North America
- Canada > British Columbia
- United States
- California > San Diego County
- San Diego (0.04)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Massachusetts (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.04)
- New York > New York County
- New York City (0.04)
- Oceania > Australia
- New South Wales > Sydney (0.04)
- Genre:
- Research Report (1.00)
- Industry:
- Government
- Health & Medicine
- Epidemiology (0.70)
- Health Care Providers & Services (0.68)
- Public Health (0.69)
- Therapeutic Area > Psychiatry/Psychology (0.31)
- Information Technology > Services (0.68)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.92)
- Media > News (0.66)
- Technology: