Bridging Discourse Treebanks with a Unified Rhetorical Structure Parser
–arXiv.org Artificial Intelligence
We introduce UniRST, the first unified RST-style discourse parser capable of handling 18 treebanks in 11 languages without modifying their relation inventories. To overcome inventory incompatibilities, we propose and evaluate two training strategies: Multi-Head, which assigns a separate relation classification layer to each inventory, and Masked-Union, which enables shared-parameter training through selective label masking. We first benchmark mono-treebank parsing with a simple yet effective augmentation technique for low-resource settings. We then train a unified model and show that (1) the parameter-efficient Masked-Union approach is also the strongest, and (2) UniRST outperforms 16 of 18 mono-treebank baselines, demonstrating the advantages of single-model, multilingual, end-to-end discourse parsing across diverse resources.
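The abstract does not spell out how the label masking works, so the following is only a minimal sketch of one plausible reading of the Masked-Union idea: a single classifier scores the union of all relation labels, and labels outside the current treebank's inventory are masked out before the softmax and loss. The treebank names, relation labels, and function names are hypothetical and chosen for illustration; this is not the authors' implementation.

```python
import math

# Hypothetical relation inventories for two treebanks (illustrative only).
INVENTORIES = {
    "treebank_a": ["Elaboration", "Contrast", "Cause"],
    "treebank_b": ["Elaboration", "Condition"],
}

# One shared label space: the union of every inventory, in a fixed order.
UNION = sorted({label for labels in INVENTORIES.values() for label in labels})


def label_mask(treebank):
    """Return one boolean per union label: True if this treebank uses it."""
    allowed = set(INVENTORIES[treebank])
    return [label in allowed for label in UNION]


def masked_log_softmax(logits, mask):
    """Log-softmax restricted to the unmasked labels.

    Masked labels get log-probability -inf, i.e. probability zero.
    """
    kept = [x for x, keep in zip(logits, mask) if keep]
    peak = max(kept)  # subtract the max for numerical stability
    log_z = peak + math.log(sum(math.exp(x - peak) for x in kept))
    return [x - log_z if keep else float("-inf") for x, keep in zip(logits, mask)]


def masked_nll(logits, gold_label, treebank):
    """Negative log-likelihood of the gold label under the masked softmax."""
    mask = label_mask(treebank)
    index = UNION.index(gold_label)
    if not mask[index]:
        raise ValueError(f"{gold_label!r} is not in the inventory of {treebank!r}")
    return -masked_log_softmax(logits, mask)[index]


if __name__ == "__main__":
    logits = [2.0, 0.5, -1.0, 1.0]  # one score per label in UNION
    print(UNION)
    print(masked_nll(logits, "Condition", "treebank_b"))
```

Under this reading, every treebank shares the same output layer, and only the mask changes per example. A Multi-Head setup would instead keep a separate output layer for each inventory on top of a shared encoder.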
Oct-9-2025
- Country:
  - North America > United States (1.00)
  - Europe (1.00)
  - Asia > Middle East > Republic of Türkiye (0.14)
- Genre:
  - Research Report > New Finding (0.46)
- Technology: