MultiProSE: A Multi-label Arabic Dataset for Propaganda, Sentiment, and Emotion Detection
Al-Henaki, Lubna, Al-Khalifa, Hend, Al-Salman, Abdulmalik, Alqubayshi, Hajar, Al-Twailay, Hind, Alghamdi, Gheeda, Aljasim, Hawra
arXiv.org Artificial Intelligence
Propaganda is a form of persuasion that has been used throughout history with the goal of influencing people's opinions through rhetorical and psychological persuasion techniques for determined ends. Although Arabic ranks as the fourth most-used language on the internet, resources for propaganda detection in languages other than English, especially Arabic, remain extremely limited. To address this gap, the first Arabic dataset for Multi-label Propaganda, Sentiment, and Emotion (MultiProSE) has been introduced. MultiProSE is an open-source extension of the existing Arabic propaganda dataset, ArPro, with the addition of sentiment and emotion annotations for each text. The dataset comprises 8,000 annotated news articles, making it the largest propaganda dataset to date. For each task, several baselines have been developed using large language models (LLMs), such as GPT-4o-mini, and pre-trained language models (PLMs), including three BERT-based models. The dataset, annotation guidelines, and source code are all publicly released to facilitate future research and development in Arabic language models and to contribute to a deeper understanding of how various opinion dimensions interact in news media.
Feb-12-2025
- Country:
  - North America
    - United States
      - Washington > King County
        - Seattle (0.04)
      - Louisiana > Orleans Parish
        - New Orleans (0.04)
      - California > Ventura County
        - Thousand Oaks (0.04)
    - Canada > Ontario
      - Toronto (0.04)
  - Europe
    - Bulgaria (0.04)
    - United Kingdom > England (0.04)
    - Switzerland (0.04)
    - Italy > Tuscany
      - Florence (0.04)
    - France > Grand Est
      - Meurthe-et-Moselle > Nancy (0.04)
    - Denmark > Capital Region
      - Copenhagen (0.04)
    - Czechia > South Moravian Region
      - Brno (0.04)
  - Asia
    - Singapore (0.04)
    - Middle East
      - Qatar (0.04)
      - UAE > Abu Dhabi Emirate
        - Abu Dhabi (0.04)
      - Saudi Arabia > Riyadh Province
        - Riyadh (0.04)
    - China
- Genre:
  - Research Report (0.83)
- Industry:
  - Government (1.00)
  - Media > News (0.93)
- Technology: