Small Languages, Big Models: A Study of Continual Training on Languages of Norway

David Samuel, Vladislav Mikhailov, Erik Velldal, Lilja Øvrelid, Lucas Georges Gabriel Charpentier, Andrey Kutuzov

arXiv.org Artificial Intelligence 

Training large language models requires vast amounts of data, posing a challenge for less widely spoken languages like Norwegian, and even more so for truly low-resource languages like Sámi. To address this issue, we present a novel three-stage continual training approach. We also experiment with combining causal and masked language modeling to get more flexible models. Based on our findings, we train, evaluate, and openly release a new large language model. This method enables us to train an 11.4B-parameter model that achieves state-of-the-art performance across Norwegian language tasks while maintaining strong capabilities in Northern Sámi.

The three main research contributions of this paper can be summarized as follows:

1. Novel training method for data-constrained language models: We propose a three-stage training method for efficient adaptation of existing language models to lower-resource languages.