Small Languages, Big Models: A Study of Continual Training on Languages of Norway
David Samuel, Vladislav Mikhailov, Erik Velldal, Lilja Øvrelid, Lucas Georges Gabriel Charpentier, Andrey Kutuzov
Training large language models requires vast amounts of data, posing a challenge for less widely spoken languages like Norwegian and even more so for truly low-resource languages like Sámi. To address this issue, we present a novel three-stage continual training approach. We also experiment with combining causal and masked language modeling to get more flexible models. Based on our findings, we train, evaluate, and openly release a new large language model. This method enables us to train an 11.4B-parameter model that achieves state-of-the-art performance across Norwegian language tasks while maintaining strong capabilities in Northern Sámi. The three main research contributions of this paper can be summarized as follows:

1. Novel training method for data-constrained language models: We propose a three-stage training method for efficient adaptation of existing large language models to lower-resource languages.
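The abstract does not spell out how the causal and masked objectives are combined, so the following is only a minimal sketch of one plausible scheme: sampling the objective per batch, with an assumed 15% mask rate. The names (`lm_step`, `MASK_ID`, `VOCAB`) and all hyperparameters are illustrative placeholders, not taken from the paper.

```python
# Minimal sketch of mixing causal and masked language modeling objectives.
# Assumptions (not from the paper): objective chosen per batch, 15% mask
# rate, and a model that maps token ids to logits of shape (B, S, VOCAB).
import torch
import torch.nn.functional as F

MASK_ID = 0        # placeholder id of the [MASK] token
VOCAB = 32000      # placeholder vocabulary size, must match the model

def lm_step(model, tokens, p_masked=0.5):
    """Compute the loss for one batch, flipping a coin between objectives."""
    if torch.rand(()) < p_masked:
        # Masked LM branch: corrupt ~15% of positions, predict the originals.
        corrupted = tokens.clone()
        mask = torch.rand(tokens.shape) < 0.15
        corrupted[mask] = MASK_ID
        logits = model(corrupted)                       # (B, S, VOCAB)
        loss = F.cross_entropy(logits[mask], tokens[mask])
    else:
        # Causal LM branch: predict each next token from its prefix.
        logits = model(tokens[:, :-1])                  # (B, S-1, VOCAB)
        loss = F.cross_entropy(
            logits.reshape(-1, VOCAB), tokens[:, 1:].reshape(-1)
        )
    return loss
```

The per-batch coin flip is just one simple way to interleave the two objectives; the authors' actual recipe (e.g., a prefix-LM variant or a different mixing schedule) may differ.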
arXiv.org Artificial Intelligence
Dec-9-2024