Multilingual Models for Check-Worthy Social Media Posts Detection

Aug-13-2024, arXiv.org Artificial Intelligence

This work presents an extensive study of transformer-based NLP models for detecting social media posts that contain verifiable factual claims and harmful claims. The study covers the full pipeline: dataset collection, dataset pre-processing, architecture selection, configuration of settings, model training (fine-tuning), model testing, and implementation. It includes a comprehensive analysis of different models, with a special focus on multilingual models, where a single model can process social media posts in both English and low-resource languages such as Arabic, Bulgarian, Dutch, Polish, Czech, and Slovak. The results were validated against state-of-the-art models, and the comparison demonstrated the robustness of the proposed models. The novelty of this work lies in the development of multi-label multilingual classification models that can efficiently detect harmful posts and posts that contain verifiable factual claims at the same time.
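
To make the multi-label idea concrete, the sketch below shows how a classifier with one output logit per label can flag a post as containing a verifiable factual claim, as harmful, as both, or as neither. It is only an illustration under stated assumptions: the label names, the 0.5 threshold, and the use of an independent sigmoid per label are hypothetical choices for this example and are not taken from the paper.

```python
import math

# Hypothetical label names for illustration; the paper's own label set may differ.
LABELS = ("verifiable_factual_claim", "harmful")


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_multilabel(logits, threshold=0.5):
    """Turn one logit per label into independent yes/no decisions.

    Unlike a softmax over classes, each label is scored on its own,
    so a post can receive both labels, one label, or none.
    The 0.5 threshold is an assumed default, not a value from the paper.
    """
    if len(logits) != len(LABELS):
        raise ValueError("expected one logit per label")
    return {label: sigmoid(z) >= threshold for label, z in zip(LABELS, logits)}


# Made-up logits for a single post: a high score for the first label,
# a low score for the second.
decision = decode_multilabel([2.0, -1.0])
```

Because each label is thresholded separately, one forward pass of one model yields both decisions, which is the sense in which a multi-label model handles the two tasks at the same time.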
