Tower: An Open Multilingual Large Language Model for Translation-Related Tasks
Alves, Duarte M., Pombal, José, Guerreiro, Nuno M., Martins, Pedro H., Alves, João, Farajian, Amin, Peters, Ben, Rei, Ricardo, Fernandes, Patrick, Agrawal, Sweta, Colombo, Pierre, de Souza, José G. C., Martins, André F. T.
arXiv.org Artificial Intelligence
Many important tasks within multilingual NLP, such as quality estimation, automatic post-editing, or grammatical error correction, involve analyzing, generating, or operating with text in multiple languages, and are relevant to various translation workflows -- we call these translation-related tasks. Recently, general-purpose large language models (LLMs) challenged the paradigm of per-task dedicated systems, achieving state-of-the-art performance on several recent WMT shared tasks (Kocmi et al., 2023; Freitag et al., 2023; Neves et al., 2023). Unfortunately, strong capabilities for multiple translation-related tasks have so far been exhibited by closed LLMs only (Hendy et al., 2023; Kocmi & Federmann, 2023; Fernandes et al., 2023; Raunak et al., 2023). Perhaps because most open LLMs are English-centric, approaches leveraging these models still lag behind, having thus far achieved competitive results only when specializing on a single task (Xu et al., 2024a; 2023; Iyer et al., 2023). In this paper, we bridge this gap with a detailed recipe to develop an LLM for multiple translation-related tasks. Our approach, illustrated in Figure 1 and inspired by Xu et al.
Feb-27-2024