Domain-Adaptive Pre-Training for Arabic Aspect-Based Sentiment Analysis: A Comparative Study of Domain Adaptation and Fine-Tuning Strategies
Alyami, Salha, Jamal, Amani, Alhothali, Areej
arXiv.org Artificial Intelligence
Aspect-based sentiment analysis (ABSA), a natural language processing task, enables organizations to understand customer opinions on specific product aspects. While deep learning models are widely used for English ABSA, their application to Arabic is limited by the scarcity of labeled data. Researchers have attempted to tackle this issue by using pre-trained contextualized language models such as BERT. However, these models are often pre-trained on fact-based data, which can introduce bias in domain-specific tasks like ABSA. To our knowledge, no studies have applied adaptive pre-training with Arabic contextualized models for ABSA. This research proposes a novel approach using domain-adaptive pre-training for aspect-sentiment classification (ASC) and opinion target expression (OTE) extraction. We examine three fine-tuning strategies (feature extraction, full fine-tuning, and adapter-based methods) to enhance performance and efficiency, utilizing multiple adaptation corpora and contextualized models. Our results show that in-domain adaptive pre-training yields modest improvements. Adapter-based fine-tuning is a computationally efficient method that achieves competitive results. However, error analyses reveal issues with both model predictions and dataset labeling. In ASC, common problems include incorrect sentiment labeling, misinterpretation of contrastive markers, positivity bias for early terms, and challenges with conflicting opinions and subword tokenization. For OTE, issues involve mislabeled targets, confusion over syntactic roles, difficulty with multi-word expressions, and reliance on shallow heuristics. These findings underscore the need for syntax- and semantics-aware models, such as graph convolutional networks, to more effectively capture long-distance relations and complex aspect-based opinion alignments.
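To make the efficiency contrast between the three fine-tuning strategies concrete, the following minimal sketch counts trainable parameters under each one. Every number and name in it is an assumption for illustration and is not taken from the paper: a BERT-base-like encoder (hidden size 768, 12 layers, roughly 110M parameters), a three-class linear head, and simplified bottleneck adapters (two per layer, bottleneck size 64, layer-norm parameters ignored).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncoderConfig:
    """Illustrative encoder sizes (assumed, BERT-base-like; not from the paper)."""
    hidden_size: int = 768
    num_layers: int = 12
    encoder_params: int = 110_000_000  # rough order of magnitude only
    num_labels: int = 3                # e.g. positive / negative / neutral


def head_params(cfg: EncoderConfig) -> int:
    # Linear classification head: weight matrix plus bias vector.
    return cfg.hidden_size * cfg.num_labels + cfg.num_labels


def adapter_params(cfg: EncoderConfig, bottleneck: int = 64,
                   adapters_per_layer: int = 2) -> int:
    # One bottleneck adapter = down-projection (d*r + r) + up-projection (r*d + d).
    d, r = cfg.hidden_size, bottleneck
    per_adapter = (d * r + r) + (r * d + d)
    return per_adapter * adapters_per_layer * cfg.num_layers


def trainable_params(cfg: EncoderConfig, strategy: str) -> int:
    """Number of parameters updated during training for a given strategy."""
    head = head_params(cfg)
    if strategy == "feature_extraction":
        # Encoder frozen; only the task head is trained.
        return head
    if strategy == "full_fine_tuning":
        # Every encoder weight and the head are trained.
        return cfg.encoder_params + head
    if strategy == "adapter":
        # Encoder frozen; small adapter modules and the head are trained.
        return adapter_params(cfg) + head
    raise ValueError(f"unknown strategy: {strategy!r}")


if __name__ == "__main__":
    cfg = EncoderConfig()
    for name in ("feature_extraction", "adapter", "full_fine_tuning"):
        print(f"{name:>20}: {trainable_params(cfg, name):,} trainable parameters")
```

Under these assumed sizes, the adapter setup trains roughly 2% of the parameters that full fine-tuning updates, which is the kind of saving usually meant when adapter-based methods are described as computationally efficient. Actual counts depend on the specific Arabic model, adapter design, and task head used.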
Sep-23-2025
- Country:
  - Asia > Middle East
    - Saudi Arabia > Mecca Province > Jeddah (0.04)
  - Europe
    - Bulgaria > Sofia City Province
      - Sofia (0.04)
    - Denmark > Capital Region
      - Copenhagen (0.04)
    - France > Provence-Alpes-Côte d'Azur
      - Bouches-du-Rhône > Marseille (0.04)
    - Italy > Tuscany
      - Florence (0.04)
    - Middle East > Malta
      - Eastern Region > Northern Harbour District > St. Julian's (0.04)
    - Romania > Sud - Muntenia Development Region
      - Giurgiu County > Giurgiu (0.04)
    - Ukraine > Kyiv Oblast
      - Kyiv (0.04)
    - United Kingdom > England
      - Cambridgeshire > Cambridge (0.04)
  - North America
    - Canada > Ontario
      - Toronto (0.04)
    - Dominican Republic (0.04)
    - United States
      - California > San Diego County
        - San Diego (0.04)
      - Minnesota > Hennepin County
        - Minneapolis (0.14)
      - Texas > Travis County
        - Austin (0.04)
  - Oceania > Australia
    - New South Wales > Sydney (0.04)
    - Victoria > Melbourne (0.04)
- Genre:
  - Research Report > New Finding (1.00)
- Industry:
  - Consumer Products & Services (0.46)
- Technology:
  - Information Technology > Artificial Intelligence
    - Machine Learning > Neural Networks
      - Deep Learning (1.00)
    - Natural Language
      - Chatbot (0.92)
      - Discourse & Dialogue (1.00)
      - Information Extraction (1.00)
      - Large Language Model (1.00)
      - Text Processing (1.00)