Pretraining Finnish ModernBERTs
Akseli Reunamo, Laura-Maria Peltonen, Hans Moen, Sampo Pyysalo
arXiv.org Artificial Intelligence
This paper reports on pretraining ModernBERT encoder models in six different sizes, ranging from 51M to 475M parameters, with a focus on limited multilingualism, emphasizing languages relevant to Finland. Our models are competitive with, or superior to, existing multilingual models. They outperform monolingual models on tasks that require a context longer than 512 tokens. We present empirical results on using different data in the final stage of training. The code and models are publicly released.
Nov-13-2025