AlcLaM: Arabic Dialectal Language Model
Murtadha Ahmed, Saghir Alfasly, Bo Wen, Jamaal Qasem, Mohammed Ahmed, Yunfeng Liu
These models significantly enhance Arabic NLP tasks over multilingual models. However, they are predominantly trained on Modern Standard Arabic (MSA) datasets. This focus on MSA introduces two primary limitations: first, there is reduced recognition of dialectal tokens, which vary widely across different Arabic-speaking regions; second, there is a biased weighting towards MSA tokens in the models, which may not accurately reflect the linguistic nuances present in everyday Arabic usage.

Pre-trained Language Models (PLMs) utilizing self-supervised learning techniques, such as BERT (Devlin et al., 2018a) and RoBERTa (Liu et al., 2019), have become pivotal in advancing the field of natural language processing (NLP) through transfer learning. These models have significantly enhanced performance across a variety of NLP tasks by leveraging vast amounts of textual data and extensive computational resources. However, the necessity for large corpora and the substantial computational …
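The "reduced recognition of dialectal tokens" the abstract describes can be observed directly in how an MSA-oriented tokenizer fragments dialectal input. Below is a minimal sketch, not from the paper: it uses the generic multilingual BERT checkpoint as a stand-in for an MSA-centric tokenizer, and the example sentences and pieces-per-word metric are illustrative assumptions rather than AlcLaM's actual evaluation.

```python
# Minimal sketch (not AlcLaM's method): measure how heavily a tokenizer
# splits MSA vs. dialectal Arabic into subword pieces. Checkpoint choice,
# sentences, and metric are illustrative assumptions.
from transformers import AutoTokenizer

# Multilingual BERT as a stand-in for a tokenizer with weak dialect coverage.
tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")

# MSA: "I want to go to the market"
msa = "أريد أن أذهب إلى السوق"
# Egyptian Arabic, same meaning
dialect = "عايز أروح السوق"

def pieces_per_word(text: str) -> float:
    """Average number of subword pieces per whitespace-separated word."""
    words = text.split()
    return sum(len(tok.tokenize(w)) for w in words) / len(words)

print(f"MSA fragmentation:     {pieces_per_word(msa):.2f} pieces/word")
print(f"Dialect fragmentation: {pieces_per_word(dialect):.2f} pieces/word")
# A markedly higher ratio for the dialectal sentence suggests its tokens are
# poorly covered by the vocabulary, i.e. the reduced dialectal recognition
# the abstract points to.
```

A dialect-aware model addresses this by expanding the vocabulary with dialectal tokens before pre-training, so everyday dialectal words map to whole vocabulary entries rather than long subword sequences.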
arXiv.org Artificial Intelligence
Jul-17-2024
- Country:
- Africa > Middle East (0.28)
- Europe (1.00)
- North America > United States
- California (0.14)
- Genre:
- Research Report (1.00)
- Technology: