IntGrad MT: Eliciting LLMs' Machine Translation Capabilities with Sentence Interpolation and Gradual MT
Seung-Woo Choi, Ga-Hyun Yoo, Jay-Yoon Lee
–arXiv.org Artificial Intelligence
Recent Large Language Models (LLMs) have demonstrated strong performance in translation without needing to be finetuned on additional parallel corpora. However, they still underperform for low-resource language pairs. Previous works have focused on mitigating this issue by leveraging relevant few-shot examples or external resources such as dictionaries or grammar books, making models heavily reliant on these nonparametric sources of information. In this paper, we propose a novel method named IntGrad MT that focuses on fully exploiting an LLM's inherent translation capability. IntGrad MT achieves this by constructing a chain of few-shot examples, each consisting of a source sentence and the model's own translation, that rise incrementally in difficulty. IntGrad MT employs two techniques: Sentence Interpolation, which generates a sequence of sentences that gradually change from an easy-to-translate sentence to a difficult one, and Gradual MT, which sequentially translates this chain, using the translations of earlier sentences as few-shot examples for the translation of subsequent ones. With this approach, we observe a substantial enhancement in the xCOMET scores of various LLMs for multiple languages, especially for low-resource languages such as Hindi (8.26). Our approach presents a practical way of enhancing LLMs' performance without extra training.

Recent Large Language Models (LLMs) have shown strong performance in translation tasks without the need for fine-tuning on specific parallel datasets. Previous studies have demonstrated that LLMs' translation capabilities are reliable in most use cases, particularly when the source and target languages are high-resource languages (Zhu et al., 2024; Robinson et al., 2023; Jiao et al., 2023). However, because LLMs require training on large corpora, they still face challenges when translating low-resource languages that are not sufficiently represented in those corpora (Stap et al., 2024). Previous research has attempted to address these challenges by leveraging the in-context-learning capabilities of LLMs, particularly through the use of external knowledge such as few-shot examples or dictionaries during inference.
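The two techniques described above can be illustrated with a short sketch. The following is a minimal, hypothetical rendering of Sentence Interpolation followed by Gradual MT, assuming only a generic `llm(prompt) -> completion` wrapper; the prompts, helper names, and example format are illustrative assumptions, not the paper's actual implementation.

```python
# Sketch of the IntGrad MT idea as described in the abstract:
# (1) Sentence Interpolation builds a chain of sentences from an easy
#     sentence to the difficult source sentence; (2) Gradual MT translates
#     the chain in order, reusing earlier (source, translation) pairs as
#     few-shot examples for the next sentence. All prompts here are
#     assumptions for illustration only.
from typing import Callable, List, Tuple

LLM = Callable[[str], str]  # prompt in, completion out (e.g., an API wrapper)

def interpolate_sentences(llm: LLM, easy: str, hard: str, steps: int = 4) -> List[str]:
    """Ask the model for a chain of sentences that gradually morphs
    the easy sentence into the hard one (hypothetical prompt)."""
    prompt = (
        f"Write {steps} sentences that gradually change from Sentence A to Sentence B,\n"
        f"becoming harder to translate at each step. One sentence per line.\n"
        f"Sentence A: {easy}\nSentence B: {hard}\n"
    )
    lines = [line.strip() for line in llm(prompt).splitlines() if line.strip()]
    return [easy] + lines[:steps] + [hard]

def gradual_mt(llm: LLM, chain: List[str], src_lang: str, tgt_lang: str) -> str:
    """Translate the chain in order; each earlier pair becomes a few-shot
    example for the next, so the model bootstraps its own demonstrations."""
    examples: List[Tuple[str, str]] = []
    translation = ""
    for sentence in chain:
        shots = "".join(f"{src_lang}: {s}\n{tgt_lang}: {t}\n\n" for s, t in examples)
        prompt = f"{shots}{src_lang}: {sentence}\n{tgt_lang}:"
        translation = llm(prompt).strip()
        examples.append((sentence, translation))
    return translation  # translation of the final (difficult) sentence
```

A caller would first build the chain with `interpolate_sentences(llm, easy_sentence, hard_sentence)` and then pass it to `gradual_mt`; the only external dependency is whatever model wrapper backs `llm`.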
Oct-15-2024