On Instruction-Finetuning Neural Machine Translation Models
Vikas Raunak, Roman Grundkiewicz, Marcin Junczys-Dowmunt
arXiv.org Artificial Intelligence
In this work, we introduce instruction finetuning for Neural Machine Translation (NMT) models, which distills instruction-following capabilities from Large Language Models (LLMs) into orders-of-magnitude smaller NMT models. Our instruction-finetuning recipe for NMT models enables customization of translations for a limited but disparate set of translation-specific tasks. We show that NMT models are capable of following multiple instructions simultaneously and demonstrate zero-shot composition of instructions. We also show that, through instruction finetuning, traditionally disparate tasks such as formality-controlled machine translation, multi-domain adaptation, and multi-modal translation can be tackled jointly by a single instruction-finetuned NMT model, at a performance level comparable to LLMs such as GPT-3.5-Turbo. To the best of our knowledge, our work is among the first to demonstrate the instruction-following capabilities of traditional NMT models, which allow for faster, cheaper, and more efficient serving of customized translations.
Oct-7-2024
- Country:
  - Oceania > Australia (0.04)
  - North America
    - Dominican Republic (0.04)
    - United States
      - Pennsylvania (0.04)
      - Massachusetts > Middlesex County
        - Cambridge (0.04)
    - Canada > Ontario
      - Toronto (0.04)
  - Europe
    - Czechia > Prague (0.05)
    - Germany > Berlin (0.04)
    - Spain > Valencian Community
      - Valencia Province > Valencia (0.04)
    - Portugal > Lisbon
      - Lisbon (0.04)
    - Ireland > Leinster
      - County Dublin > Dublin (0.04)
    - Belgium > Brussels-Capital Region
      - Brussels (0.04)
  - Asia > Middle East
    - UAE (0.04)
- Genre:
  - Research Report (0.82)
- Technology: