RobBERT-2022: Updating a Dutch Language Model to Account for Evolving Language Use

Delobelle, Pieter, Winters, Thomas, Berendt, Bettina

Nov-15-2022–arXiv.org Artificial Intelligence

Large transformer-based language models, e.g. BERT and GPT-3, outperform previous architectures on most natural language processing tasks. Such language models are first pre-trained on gigantic corpora of text and later used as base models for fine-tuning on a particular task. Since the pre-training step is usually not repeated, base models are not up to date with the latest information. In this paper, we update RobBERT, a RoBERTa-based state-of-the-art Dutch language model, which was trained in 2019. First, the tokenizer of RobBERT is updated to include new high-frequency tokens present in the latest Dutch OSCAR corpus, e.g. corona-related words. Then we further pre-train the RobBERT model using this dataset. To evaluate whether our new model is a plug-in replacement for RobBERT, we introduce two additional criteria based on concept drift of existing tokens and alignment for novel tokens. We found that for certain language tasks this update results in a significant performance increase. These results highlight the benefit of continually updating a language model to account for evolving language use.
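The tokenizer update described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simplified word-level vocabulary rather than RobBERT's subword tokenizer, and the names `select_new_tokens`, `extend_vocab`, `min_count`, and `max_new`, as well as the toy corpus, are hypothetical. It shows only the general idea of finding frequent words that the old vocabulary lacks and appending them without disturbing existing token ids.

```python
from collections import Counter


def select_new_tokens(corpus, vocab, min_count=2, max_new=100):
    """Return frequent whitespace-separated words in `corpus` that are missing from `vocab`.

    Assumption: word-level matching is a stand-in for a real subword tokenizer.
    """
    counts = Counter(
        word
        for line in corpus
        for word in line.lower().split()
        if word not in vocab
    )
    # Keep the most frequent candidates that reach the (hypothetical) threshold.
    return [word for word, count in counts.most_common(max_new) if count >= min_count]


def extend_vocab(vocab, new_tokens):
    """Append new tokens after the existing ids so that old ids stay unchanged.

    Assumption: existing ids are contiguous and start at 0.
    """
    extended = dict(vocab)
    for token in new_tokens:
        if token not in extended:
            extended[token] = len(extended)
    return extended


# Toy data, invented for illustration only.
old_vocab = {"<s>": 0, "</s>": 1, "de": 2, "het": 3, "is": 4, "virus": 5}
new_corpus = [
    "de coronamaatregelen zijn verlengd",
    "het coronavirus is een virus",
    "de coronamaatregelen en het coronavirus",
]

new_tokens = select_new_tokens(new_corpus, old_vocab, min_count=2)
new_vocab = extend_vocab(old_vocab, new_tokens)
print(new_tokens)
print(new_vocab)
```

Keeping the old ids fixed means the existing embedding rows can be reused as they are, so only the rows for the appended tokens need to be initialised before further pre-training. With a real model, the embedding matrix would also have to grow to match the new vocabulary size. In Hugging Face Transformers that is typically done with `tokenizer.add_tokens` followed by `model.resize_token_embeddings`, though whether the paper used those calls is not stated here.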

large language model, machine learning, natural language

Country:
- North America
  - Dominican Republic (0.04)
  - United States > Minnesota
    - Hennepin County > Minneapolis (0.14)
- Europe
  - Germany > Berlin (0.04)
  - Finland (0.04)
  - Netherlands
    - North Brabant > Eindhoven (0.04)
    - Limburg > Maastricht (0.04)
    - Gelderland > Nijmegen (0.04)
  - Belgium > Flanders
    - Flemish Brabant > Leuven (0.04)

Genre:
- Research Report (0.64)

Industry:
- Health & Medicine > Therapeutic Area > Immunology (0.31)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language
    - Large Language Model (1.00)
    - Chatbot (0.68)
  - Machine Learning > Neural Networks
    - Deep Learning (0.88)
