Exposing the Cracks: Vulnerabilities of Retrieval-Augmented LLM-based Machine Translation
Yanming Sun, Runzhe Zhan, Chi Seng Cheang, Han Wu, Xuebo Liu, Yuyao Niu, Fengying Ye, Kaixin Lan, Lidia S. Chao, Derek F. Wong
arXiv.org Artificial Intelligence
REtrieval-Augmented LLM-based Machine Translation (REAL-MT) shows promise for knowledge-intensive tasks such as idiomatic translation, but its reliability under noisy retrieval, a common challenge in real-world deployment, remains poorly understood. To address this gap, we propose a noise synthesis framework and new metrics to systematically evaluate REAL-MT's reliability across high-, medium-, and low-resource language pairs. Using both open- and closed-source models, including standard LLMs and large reasoning models (LRMs), we find that models rely heavily on retrieved context, and that this dependence is significantly more detrimental in low-resource language pairs, where it produces nonsensical translations. Although LRMs possess enhanced reasoning capabilities, they show no improvement in error correction and are even more susceptible to noise, tending to rationalize incorrect contexts. Attention analysis reveals a shift of focus from the source idiom to the noisy content, while model confidence increases even as accuracy declines, indicating poor self-monitoring. To mitigate these issues, we investigate training-free and fine-tuning strategies, which improve robustness at the cost of performance in clean contexts, revealing a fundamental trade-off. Our findings highlight the limitations of current approaches and underscore the need for self-verifying integration mechanisms.
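The abstract's noise synthesis and robustness evaluation can be pictured with a minimal sketch. This is an illustrative assumption, not the paper's actual implementation: noisy retrieval contexts are built by swapping a fraction of gold passages for distractors, and robustness is scored as the relative accuracy drop between clean and noisy settings. All function names here are hypothetical.

```python
import random

def synthesize_noisy_context(gold_context, distractor_pool, noise_ratio, rng=random):
    """Replace a fraction (noise_ratio) of gold retrieved passages with
    distractors drawn from distractor_pool (hypothetical noise synthesis)."""
    n_noisy = round(len(gold_context) * noise_ratio)
    noisy_idx = set(rng.sample(range(len(gold_context)), n_noisy))
    return [rng.choice(distractor_pool) if i in noisy_idx else passage
            for i, passage in enumerate(gold_context)]

def relative_drop(acc_clean, acc_noisy):
    """Relative accuracy drop under noise: 0.0 means fully robust,
    1.0 means all clean-context accuracy is lost."""
    return (acc_clean - acc_noisy) / acc_clean
```

For example, a model scoring 0.8 idiom-translation accuracy with clean context and 0.6 with noisy context has a relative drop of 0.25; comparing this quantity across high- and low-resource pairs is one way to quantify the dependence on retrieved context described above.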
Nov-18-2025