EBBS: An Ensemble with Bi-Level Beam Search for Zero-Shot Machine Translation
Yuqiao Wen, Behzad Shayegh, Chenyang Huang, Yanshuai Cao, Lili Mou
Machine translation is a widely applicable NLP task that translates a text from a source language to a target language (Brown et al., 1990; Bahdanau et al., 2015). The Transformer architecture (Vaswani et al., 2017) and pretrained large language models (Radford et al., 2019; Raffel et al., 2020; Lewis et al., 2020) have substantially improved translation performance, especially in the supervised setting, where a model can learn from large volumes of parallel corpora. However, machine translation remains challenging for low-resource languages, because there is not enough data for large neural networks to learn these languages. We specifically focus on multilingual translation in the zero-shot setting, where the system is required to translate between unseen language pairs. Since it is prohibitively expensive to collect parallel data and train an individual model for every translation pair, it is common to build a single multilingual system (Johnson et al., 2017; Fan et al., 2021) that can perform translation for all language pairs, most of which are zero-shot directions, with only a few exceptions (e.g., directions involving English). These models work by prepending a language-indicator token to the input, and zero-shot ability emerges as the model generalizes from trained language pairs to unseen ones (Liu et al., 2021; Wicks and Duh, 2022).
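As a concrete illustration of the language-indicator mechanism described above, the following is a minimal sketch using the public M2M-100 checkpoint (Fan et al., 2021) through the Hugging Face transformers library. The checkpoint name, language pair, and decoding settings are illustrative assumptions, not part of the paper; the German-to-French direction is merely an example of a non-English pair and is not necessarily zero-shot for this particular model.

```python
# A minimal sketch of language-tagged multilingual translation, assuming the
# public M2M-100 checkpoint (Fan et al., 2021) via Hugging Face transformers.
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")
model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")

# Setting the source language determines the language-indicator token that the
# tokenizer prepends to the encoder input; forcing the target-language token as
# the first decoder token selects the translation direction.
tokenizer.src_lang = "de"
encoded = tokenizer("Maschinelle Übersetzung bleibt schwierig.", return_tensors="pt")
generated = model.generate(
    **encoded,
    forced_bos_token_id=tokenizer.get_lang_id("fr"),  # German -> French
    num_beams=5,  # ordinary beam search; EBBS instead performs a bi-level beam search over an ensemble
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```

The same single model serves every direction by changing only the source-language setting and the forced target-language token, which is what makes the multilingual, zero-shot setup practical compared with training one model per pair.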
arXiv.org Artificial Intelligence
Feb-29-2024