MaLA-500: Massive Language Adaptation of Large Language Models
Peiqin Lin, Shaoxiong Ji, Jörg Tiedemann, André F. T. Martins, Hinrich Schütze
Large language models have advanced the state of the art in natural language processing. However, because they are designed predominantly for English or a limited set of languages, they are substantially less effective for low-resource languages. To bridge this gap, we introduce MaLA-500, a novel large language model designed to cover an extensive range of 534 languages. To train MaLA-500, we employ vocabulary extension and continued pretraining on LLaMA 2 with Glot500-c. Our experiments on SIB-200 show that MaLA-500 achieves state-of-the-art in-context learning results. We release MaLA-500 at https://huggingface.co/MaLA-LM.
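As a rough sketch of the adaptation recipe the abstract names (vocabulary extension followed by continued pretraining on LLaMA 2), the snippet below shows how such a setup is commonly wired up with the Hugging Face transformers library. The base checkpoint id and the placeholder tokens are assumptions for illustration, not details from the paper; in practice the new tokens would come from a tokenizer trained on Glot500-c.

```python
# Sketch of vocabulary extension + continued pretraining, assuming a
# LLaMA 2 base checkpoint. Identifiers below are illustrative, not
# taken from the paper.
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "meta-llama/Llama-2-7b-hf"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)

# 1) Vocabulary extension: add subword tokens covering new scripts and
#    languages. Placeholder tokens here; real ones would be derived
#    from a tokenizer trained on the Glot500-c corpus.
new_tokens = ["▁token_a", "▁token_b"]
num_added = tokenizer.add_tokens(new_tokens)

model = AutoModelForCausalLM.from_pretrained(base)
# 2) Resize the embedding matrix so the added tokens get trainable rows.
model.resize_token_embeddings(len(tokenizer))

# 3) Continued pretraining would then run a standard causal-LM training
#    loop over Glot500-c (e.g., with transformers' Trainer); omitted here.
```

Loading the released model presumably follows the standard `from_pretrained` pattern against a checkpoint under the linked MaLA-LM organization; the exact model id there is not stated in the abstract.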
arXiv.org Artificial Intelligence
Jan-24-2024