AlcLaM: Arabic Dialectal Language Model
Murtadha Ahmed, Saghir Alfasly, Bo Wen, Jamaal Qasem, Mohammed Ahmed, Yunfeng Liu
These models significantly enhance Arabic NLP tasks over multilingual models. However, they are predominantly trained on Modern Standard Arabic (MSA) datasets. This focus on MSA introduces two primary limitations: first, there is reduced recognition of dialectal tokens, which vary widely across different Arabic-speaking regions; second, there is a biased weighting towards MSA tokens in the models, which may not accurately reflect the linguistic nuances present in everyday Arabic usage.

Pre-trained Language Models (PLMs) utilizing self-supervised learning techniques, such as BERT (Devlin et al., 2018a) and RoBERTa (Liu et al., 2019), have become pivotal in advancing the field of natural language processing (NLP) through transfer learning. These models have significantly enhanced performance across a variety of NLP tasks by leveraging vast amounts of textual data and extensive computational resources. However, the necessity for large corpora and the substantial computational …
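The "reduced recognition of dialectal tokens" the abstract describes can be observed directly in how an MSA-oriented tokenizer fragments dialectal input. Below is a minimal sketch, not from the paper: it uses the generic multilingual BERT checkpoint as a stand-in for an MSA-centric tokenizer, and the example sentences and pieces-per-word metric are illustrative assumptions rather than AlcLaM's actual evaluation.

```python
# Minimal sketch (not AlcLaM's method): measure how heavily a tokenizer
# splits MSA vs. dialectal Arabic into subword pieces. Checkpoint choice,
# sentences, and metric are illustrative assumptions.
from transformers import AutoTokenizer

# Multilingual BERT as a stand-in for a tokenizer with weak dialect coverage.
tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")

# MSA: "I want to go to the market"
msa = "أريد أن أذهب إلى السوق"
# Egyptian Arabic, same meaning
dialect = "عايز أروح السوق"

def pieces_per_word(text: str) -> float:
    """Average number of subword pieces per whitespace-separated word."""
    words = text.split()
    return sum(len(tok.tokenize(w)) for w in words) / len(words)

print(f"MSA fragmentation:     {pieces_per_word(msa):.2f} pieces/word")
print(f"Dialect fragmentation: {pieces_per_word(dialect):.2f} pieces/word")
# A markedly higher ratio for the dialectal sentence suggests its tokens are
# poorly covered by the vocabulary, i.e. the reduced dialectal recognition
# the abstract points to.
```

A dialect-aware model addresses this by expanding the vocabulary with dialectal tokens before pre-training, so everyday dialectal words map to whole vocabulary entries rather than long subword sequences.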
arXiv.org Artificial Intelligence
Jul-17-2024
- Country:
- Africa > Middle East (0.28)
- Europe (1.00)
- North America > United States
- California (0.14)
- Genre:
- Research Report (1.00)
- Technology: