Structured Document Translation via Format Reinforcement Learning
Haiyue Song, Johannes Eschbach-Dymanus, Hour Kaing, Sumire Honda, Hideki Tanaka, Bianka Buschbeck, Masao Utiyama
arXiv.org Artificial Intelligence
Recent work on structured text translation remains limited to the sentence level because it struggles to handle complex document-level XML or HTML structures. To address this, we propose Format Reinforcement Learning (FormatRL), which applies Group Relative Policy Optimization on top of a supervised fine-tuned model to directly optimize novel structure-aware rewards: 1) TreeSim, which measures structural similarity between predicted and reference XML trees, and 2) Node-chrF, which measures translation quality at the level of XML nodes. Additionally, we apply StrucAUC, a fine-grained metric that distinguishes minor errors from major structural failures. Experiments on the SAP software-documentation benchmark demonstrate improvements across six metrics, and an analysis further shows how the different reward functions contribute to both structural and translation quality.
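The abstract does not give TreeSim's exact formulation, but a structure-similarity reward of this kind can be sketched as a multiset F1 score over (tag, depth) pairs of the predicted and reference XML trees. Everything below (the function names, the choice of F1 over node signatures, and scoring a malformed prediction as 0) is an illustrative assumption, not the paper's definition:

```python
import xml.etree.ElementTree as ET
from collections import Counter

def tree_nodes(xml_str):
    """Collect a multiset of (tag, depth) pairs; empty if the XML is malformed."""
    try:
        root = ET.fromstring(xml_str)
    except ET.ParseError:
        return Counter()  # malformed output earns zero structural reward
    nodes = Counter()
    stack = [(root, 0)]
    while stack:
        el, depth = stack.pop()
        nodes[(el.tag, depth)] += 1
        stack.extend((child, depth + 1) for child in el)
    return nodes

def tree_sim(pred_xml, ref_xml):
    """Hypothetical TreeSim stand-in: multiset F1 over (tag, depth) pairs."""
    pred, ref = tree_nodes(pred_xml), tree_nodes(ref_xml)
    if not pred or not ref:
        return 0.0
    overlap = sum((pred & ref).values())  # min counts per node signature
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

Used as an RL reward, a score like this is dense enough to credit partially correct structures (shared tags at the right depth) while still zeroing out unparseable output, which is the kind of signal GRPO can optimize directly.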
Dec-5-2025