Bilingual Rhetorical Structure Parsing with Large Parallel Annotations
arXiv.org Artificial Intelligence
Discourse parsing is a crucial task in natural language processing that aims to reveal the higher-level relations in a text. Despite growing interest in cross-lingual discourse parsing, challenges persist due to limited parallel data and inconsistencies in how Rhetorical Structure Theory (RST) is applied across languages and corpora. To address these challenges, we introduce a parallel Russian annotation for the large and diverse English GUM RST corpus. Leveraging recent advances, our end-to-end RST parser achieves state-of-the-art results on both English and Russian corpora. It is effective in both monolingual and bilingual settings and transfers successfully even with limited second-language annotation. To the best of our knowledge, this work is the first to evaluate the potential of cross-lingual end-to-end RST parsing on a manually annotated parallel corpus.
Sep-23-2024
- Country:
  - North America
    - Dominican Republic (0.04)
    - United States
      - Maryland > Baltimore (0.04)
      - Texas > Travis County
        - Austin (0.14)
      - New Mexico > Santa Fe County
        - Santa Fe (0.04)
      - Minnesota > Hennepin County
        - Minneapolis (0.14)
      - Colorado > Denver County
        - Denver (0.04)
      - California > Los Angeles County
        - Los Angeles (0.04)
    - Canada > Ontario
      - Toronto (0.04)
  - Europe
    - Italy (0.04)
    - Spain > Valencian Community
      - Valencia Province > Valencia (0.04)
    - Russia > Central Federal District
      - Moscow Oblast > Moscow (0.04)
    - Ireland > Leinster
      - County Dublin > Dublin (0.04)
    - Germany > Saarland
      - Saarbrücken (0.04)
    - Croatia > Dubrovnik-Neretva County
      - Dubrovnik (0.04)
  - Asia
    - South Korea (0.04)
    - Russia (0.04)
    - China > Hong Kong (0.04)
    - Middle East > UAE
      - Abu Dhabi Emirate > Abu Dhabi (0.04)
    - Japan > Honshū
      - Kansai > Osaka Prefecture > Osaka (0.04)
  - North America
- Genre:
  - Research Report > New Finding (1.00)
- Technology: