Improving Natural Language Inference in Arabic using Transformer Models and Linguistically Informed Pre-Training

Mohammad Majd Saad Al Deen, Maren Pielka, Jörn Hees, Bouthaina Soulef Abdou, Rafet Sifa

Jul-27-2023–arXiv.org Artificial Intelligence

This paper addresses the classification of Arabic text data in Natural Language Processing (NLP), with a particular focus on Natural Language Inference (NLI) and Contradiction Detection (CD). Arabic is considered a resource-poor language: few data sets are available, which in turn limits the NLP methods available for it. To overcome this limitation, we create a dedicated data set from publicly available resources. We then train and evaluate transformer-based machine learning models. We find that a language-specific model (AraBERT) performs competitively with state-of-the-art multilingual approaches when we apply linguistically informed pre-training methods such as Named Entity Recognition (NER). To our knowledge, this is the first large-scale evaluation for this task in Arabic, as well as the first application of multi-task pre-training in this context.

language inference, machine learning, natural language

