Tower: An Open Multilingual Large Language Model for Translation-Related Tasks
Alves, Duarte M., Pombal, José, Guerreiro, Nuno M., Martins, Pedro H., Alves, João, Farajian, Amin, Peters, Ben, Rei, Ricardo, Fernandes, Patrick, Agrawal, Sweta, Colombo, Pierre, de Souza, José G. C., Martins, André F. T.
arXiv.org Artificial Intelligence
Many important tasks within multilingual NLP, such as quality estimation, automatic post-editing, or grammatical error correction, involve analyzing, generating, or operating with text in multiple languages, and are relevant to various translation workflows -- we call these translation-related tasks. Recently, general-purpose large language models (LLMs) challenged the paradigm of per-task dedicated systems, achieving state-of-the-art performance on several recent WMT shared tasks (Kocmi et al., 2023; Freitag et al., 2023; Neves et al., 2023). Unfortunately, strong capabilities for multiple translation-related tasks have so far been exhibited by closed LLMs only (Hendy et al., 2023; Kocmi & Federmann, 2023; Fernandes et al., 2023; Raunak et al., 2023). Perhaps because most open LLMs are English-centric, approaches leveraging these models still lag behind, having thus far achieved competitive results only when specializing on a single task (Xu et al., 2024a; 2023; Iyer et al., 2023). In this paper, we bridge this gap with a detailed recipe to develop an LLM for multiple translation-related tasks. Our approach, illustrated in Figure 1 and inspired by Xu et al.
Feb-27-2024