Summarising Historical Text in Modern Languages
Xutan Peng, Yi Zheng, Chenghua Lin, Advaith Siddharthan
arXiv.org Artificial Intelligence
We introduce the task of historical text summarisation, where documents in historical forms of a language are summarised in the corresponding modern language. This is a fundamentally important routine task for historians and digital humanities researchers, but it has never been automated. We compile a high-quality gold-standard text summarisation dataset, which consists of historical German and Chinese news from hundreds of years ago summarised in modern German or Chinese. Based on cross-lingual transfer learning techniques, we propose a summarisation model that can be trained even with no cross-lingual (historical-to-modern) parallel data, and benchmark it against state-of-the-art algorithms. We report automatic and human evaluations that distinguish the historical-to-modern summarisation task from standard cross-lingual summarisation (i.e., modern to modern language), highlight the distinctness and value of our dataset, and demonstrate that our transfer learning approach outperforms standard cross-lingual benchmarks on this task.
Jan-26-2021
- Country:
- Africa > Middle East
- Morocco (0.04)
- Asia
- China
- Hong Kong (0.04)
- Jiangsu Province > Nanjing (0.04)
- Middle East
- Israel > Jerusalem District
- Jerusalem (0.04)
- Republic of Türkiye (0.04)
- Russia (0.04)
- Atlantic Ocean > Black Sea (0.04)
- Caspian Sea (0.04)
- Europe
- Hungary (0.04)
- Sweden (0.04)
- Poland > Lower Silesia Province
- Wroclaw (0.04)
- Holy See (0.04)
- Russia (0.04)
- Italy (0.04)
- France (0.04)
- Portugal > Lisbon
- Lisbon (0.14)
- United Kingdom > England
- South Yorkshire > Sheffield (0.04)
- Spain
- Catalonia > Barcelona Province
- Barcelona (0.04)
- Valencian Community > Valencia Province
- Valencia (0.04)
- Germany
- Brandenburg > Potsdam (0.04)
- Bavaria > Upper Bavaria
- Munich (0.04)
- Austria > Vienna (0.04)
- North America
- Canada > British Columbia
- United States
- Maryland > Baltimore (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.14)
- New Mexico > Santa Fe County
- Santa Fe (0.04)
- Texas > Travis County
- Austin (0.04)
- Oceania > Australia (0.04)
- Genre:
- Research Report (1.00)
- Industry:
- Government (1.00)
- Media > News (0.46)
- Technology: