MaLA-500: Massive Language Adaptation of Large Language Models
Peiqin Lin, Shaoxiong Ji, Jörg Tiedemann, André F. T. Martins, Hinrich Schütze
Large language models have advanced the state of the art in natural language processing. However, because they are designed predominantly for English or a limited set of languages, they are substantially less effective for low-resource languages. To bridge this gap, we introduce MaLA-500, a novel large language model designed to cover an extensive range of 534 languages. To train MaLA-500, we employ vocabulary extension and continued pretraining on LLaMA 2 with Glot500-c. Our experiments on SIB-200 show that MaLA-500 achieves state-of-the-art in-context learning results. We release MaLA-500 at https://huggingface.co/MaLA-LM.
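As a rough sketch of the adaptation recipe the abstract names (vocabulary extension followed by continued pretraining on LLaMA 2), the snippet below shows how such a setup is commonly wired up with the Hugging Face transformers library. The base checkpoint id and the placeholder tokens are assumptions for illustration, not details from the paper; in practice the new tokens would come from a tokenizer trained on Glot500-c.

```python
# Sketch of vocabulary extension + continued pretraining, assuming a
# LLaMA 2 base checkpoint. Identifiers below are illustrative, not
# taken from the paper.
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "meta-llama/Llama-2-7b-hf"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)

# 1) Vocabulary extension: add subword tokens covering new scripts and
#    languages. Placeholder tokens here; real ones would be derived
#    from a tokenizer trained on the Glot500-c corpus.
new_tokens = ["▁token_a", "▁token_b"]
num_added = tokenizer.add_tokens(new_tokens)

model = AutoModelForCausalLM.from_pretrained(base)
# 2) Resize the embedding matrix so the added tokens get trainable rows.
model.resize_token_embeddings(len(tokenizer))

# 3) Continued pretraining would then run a standard causal-LM training
#    loop over Glot500-c (e.g., with transformers' Trainer); omitted here.
```

Loading the released model presumably follows the standard `from_pretrained` pattern against a checkpoint under the linked MaLA-LM organization; the exact model id there is not stated in the abstract.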
arXiv.org Artificial Intelligence
Jan-24-2024