Turning English-centric LLMs Into Polyglots: How Much Multilinguality Is Needed?

Tannon Kew, Florian Schottmann, Rico Sennrich

arXiv.org Artificial Intelligence 

The vast majority of today's large language models are English-centric, having been pretrained predominantly on English text. Yet, in order to meet user expectations, models need to be able to respond appropriately in multiple languages once deployed in downstream applications. Given limited exposure to other languages during pretraining, cross-lingual transfer is important for achieving decent performance in non-English settings. In this work, we investigate just how much multilinguality is required during finetuning to elicit strong cross-lingual generalisation across a range of tasks and target languages. We find that, compared to English-only finetuning, multilingual instruction tuning with as few as three languages substantially improves cross-lingual generalisation.

Figure 1: Input/output (IO) language agreement for English (en), German (de), Bulgarian (bg) and Icelandic (is) when instruction tuning on monolingual English (Mono) or on multilingual data (Multi-Guanaco).
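
As a concrete illustration of the IO language agreement metric referenced in Figure 1, the sketch below scores instruction/response pairs by checking whether an off-the-shelf language identifier assigns the response the same language as the prompt. This is a minimal sketch under assumptions: the use of langdetect, the function name io_language_agreement, and the example texts are illustrative only; the paper's actual language-identification tooling and evaluation pipeline may differ, and langdetect covers a limited set of languages, so a broader-coverage identifier (e.g. a fastText LID model) may be preferable for languages such as Icelandic.

```python
# Hypothetical sketch of an IO language agreement score (not the paper's exact pipeline).
from langdetect import detect, DetectorFactory

DetectorFactory.seed = 0  # make langdetect deterministic across runs


def io_language_agreement(prompts, responses):
    """Fraction of examples where the detected response language
    matches the detected prompt language."""
    assert len(prompts) == len(responses) and prompts
    matches = 0
    for prompt, response in zip(prompts, responses):
        try:
            if detect(prompt) == detect(response):
                matches += 1
        except Exception:
            # Very short or ambiguous texts can fail detection; count as a mismatch.
            pass
    return matches / len(prompts)


# Example: a German prompt answered in English counts as a mismatch (agreement = 0.0).
prompts = ["Wie funktioniert ein Transformer-Modell?"]
responses = ["A transformer model relies on self-attention over the input tokens ..."]
print(io_language_agreement(prompts, responses))
```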