Multilingual Models for Check-Worthy Social Media Posts Detection
Kula, Sebastian, Gregor, Michal
arXiv.org Artificial Intelligence
This work presents an extensive study of transformer-based NLP models for detecting social media posts that contain verifiable factual claims and harmful claims. The study covers dataset collection, dataset pre-processing, architecture selection, configuration of settings, model training (fine-tuning), model testing, and implementation. It includes a comprehensive analysis of different models, with a special focus on multilingual models, where a single model can process social media posts both in English and in low-resource languages such as Arabic, Bulgarian, Dutch, Polish, Czech, and Slovak. The results were validated against state-of-the-art models, and the comparison demonstrated the robustness of the proposed models. The novelty of this work lies in the development of multi-label multilingual classification models that can efficiently and simultaneously detect harmful posts and posts that contain verifiable factual claims.
Aug-13-2024
- Country:
  - Africa > South Africa (0.04)
  - Asia
    - British Indian Ocean Territory > Diego Garcia (0.04)
    - Middle East > Saudi Arabia
      - Asir Province > Abha (0.04)
  - Europe
    - Austria > Vienna (0.14)
    - France (0.04)
    - Italy > Emilia-Romagna
      - Metropolitan City of Bologna > Bologna (0.04)
    - Poland > Kuyavian-Pomeranian Province
      - Bydgoszcz (0.04)
    - Romania > București - Ilfov Development Region
      - Municipality of Bucharest > Bucharest (0.04)
    - Slovakia > Bratislava
      - Bratislava (0.04)
  - North America
    - Canada > Ontario
      - Toronto (0.04)
    - United States
      - Colorado (0.04)
      - Minnesota > Hennepin County
        - Minneapolis (0.14)
      - Washington > King County
        - Seattle (0.04)
  - Oceania > Australia
- Genre:
  - Research Report (1.00)