AlcLaM: Arabic Dialectal Language Model

Murtadha Ahmed, Saghir Alfasly, Bo Wen, Jamaal Qasem, Mohammed Ahmed, Yunfeng Liu

arXiv.org Artificial Intelligence 

Pre-trained Language Models (PLMs) utilizing self-supervised learning techniques, such as BERT (Devlin et al., 2018a) and RoBERTa (Liu et al., 2019), have become pivotal in advancing the field of natural language processing (NLP) through transfer learning. These models have significantly enhanced performance across a variety of NLP tasks by leveraging vast amounts of textual data and extensive computational resources. However, the necessity for large corpora and the substantial computational resources [...]

Arabic-specific PLMs significantly enhance Arabic NLP tasks over multilingual models. However, they are predominantly trained on Modern Standard Arabic (MSA) datasets. This focus on MSA introduces two primary limitations: first, there is reduced recognition of dialectal tokens, which vary widely across different Arabic-speaking regions; second, there is a biased weighting towards MSA tokens in the models, which may not accurately reflect the linguistic nuances present in everyday Arabic usage.
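As a concrete illustration of the first limitation, the sketch below tokenizes a short dialectal sentence with two publicly released checkpoints (AraBERT and multilingual BERT, assuming the Hugging Face transformers library; neither checkpoint is AlcLaM itself, and the sentence is a made-up example). A vocabulary built mostly from MSA tends to split dialect-only words into many subword pieces or map them to [UNK], weakening their representations.

```python
# Minimal sketch: how MSA-centric vocabularies fragment dialectal tokens.
# Assumes the Hugging Face `transformers` package; the checkpoints below are
# public baselines, not the AlcLaM model itself.
from transformers import AutoTokenizer

msa_tok = AutoTokenizer.from_pretrained("aubmindlab/bert-base-arabertv02")
mbert_tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")

# An illustrative dialectal sentence (Gulf/Egyptian forms),
# roughly "What do you think of this movie?"
sentence = "وش رايك في الفيلم ده"

for name, tok in [("AraBERT", msa_tok), ("mBERT", mbert_tok)]:
    pieces = tok.tokenize(sentence)
    # Dialect-only words such as "وش" and "ده" are often split into short
    # subword pieces, inflating sequence length and diluting word identity.
    print(f"{name}: {len(pieces)} subword pieces -> {pieces}")
```

Producing fewer, more meaningful pieces for dialectal words is one indicator that a vocabulary reflects everyday Arabic usage rather than MSA alone.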
