Discourse Graph Guided Document Translation with Large Language Models
Pham, Viet-Thanh; Wang, Minghan; Liao, Hao-Han; Vu, Thuy-Trang
arXiv.org Artificial Intelligence
Adapting large language models to full document translation remains challenging due to the difficulty of capturing long-range dependencies and preserving discourse coherence throughout extended texts. While recent agentic machine translation systems mitigate context window constraints through multi-agent orchestration and persistent memory, they require substantial computational resources and are sensitive to memory retrieval strategies. We introduce TransGraph, a discourse-guided framework that explicitly models inter-chunk relationships through structured discourse graphs and selectively conditions each translation segment on relevant graph neighbourhoods rather than relying on sequential or exhaustive context. Across three document-level MT benchmarks spanning six languages and diverse domains, TransGraph consistently surpasses strong baselines in translation quality and terminology consistency while incurring significantly lower token overhead.
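The idea of conditioning each segment on its graph neighbourhood, rather than on the preceding chunks, can be illustrated with a minimal sketch. The abstract does not specify how TransGraph builds its graph or assembles prompts, so everything here is an assumption made for illustration: the function names `build_graph` and `select_context`, the relation labels, the undirected adjacency map, and the placeholder translations are all hypothetical.

```python
def build_graph(edges):
    """Build an undirected adjacency map from (chunk_i, chunk_j, relation) triples.

    Hypothetical structure: the paper's actual graph format is not given in the abstract.
    """
    graph = {}
    for i, j, relation in edges:
        graph.setdefault(i, {})[j] = relation
        graph.setdefault(j, {})[i] = relation
    return graph


def select_context(graph, chunks, translations, target):
    """Return (relation, source, translation) for each already-translated neighbour of `target`."""
    context = []
    for neighbour in sorted(graph.get(target, {})):
        if neighbour in translations:
            relation = graph[target][neighbour]
            context.append((relation, chunks[neighbour], translations[neighbour]))
    return context


# Toy document with four chunks; the relation labels are illustrative only.
chunks = ["Intro ...", "Method ...", "Aside ...", "Results ..."]
edges = [(0, 1, "elaboration"), (1, 3, "result"), (0, 3, "summary")]
graph = build_graph(edges)

# Placeholder translations for chunks that were already processed.
translations = {0: "T0", 1: "T1", 2: "T2"}

# Chunk 3 is conditioned on chunks 0 and 1 only; the adjacent chunk 2 is
# skipped because no edge links it to chunk 3.
context = select_context(graph, chunks, translations, target=3)
```

In this sketch the context for the final chunk contains two entries instead of all three earlier chunks, which is one plausible way selective conditioning could lower token overhead compared with sequential or exhaustive context.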
Nov-11-2025
- Country:
  - Asia
    - China > Hong Kong (0.04)
    - Middle East > UAE
      - Abu Dhabi Emirate > Abu Dhabi (0.14)
    - Singapore (0.04)
    - Thailand > Bangkok
      - Bangkok (0.04)
  - North America > Canada
- Genre:
- Research Report > New Finding (0.46)
- Technology: