HausaNLP at SemEval-2023 Task 10: Transfer Learning, Synthetic Data and Side-Information for Multi-Level Sexism Classification

Aliyu, Saminu Mohammad, Abdulmumin, Idris, Muhammad, Shamsuddeen Hassan, Ahmad, Ibrahim Said, Salahudeen, Saheed Abdullahi, Yusuf, Aliyu, Lawan, Falalu Ibrahim

Apr-28-2023–arXiv.org Artificial Intelligence

We present the findings of our participation in SemEval-2023 Task 10: Explainable Detection of Online Sexism (EDOS), a shared task on detecting offensive language (sexism) in English Gab and Reddit datasets. We investigated the effects of transferring two language models, XLM-T (sentiment classification) and HateBERT (same domain, Reddit), for multi-level classification into Sexist or not Sexist, and for the subsequent sub-classifications of the sexist data. We also used synthetic classification of an unlabelled dataset and intermediary class information to maximize the performance of our models. We submitted a system for Task A, and it ranked 49th with an F1-score of 0.82. This result is competitive, as it under-performed the best system by only 0.052 F1-score.
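
The abstract does not say how the synthetic labels for the unlabelled data were produced. One common approach, assumed here purely for illustration, is confidence-thresholded pseudo-labelling: a model trained on the labelled data scores each unlabelled text, and only confident predictions are kept as extra training examples. The sketch below is not the authors' implementation; `pseudo_label`, `toy_scorer`, and the 0.9 threshold are hypothetical names and values.

```python
from typing import Callable, List, Tuple


def pseudo_label(
    predict_proba: Callable[[str], float],
    unlabelled: List[str],
    threshold: float = 0.9,
) -> List[Tuple[str, int]]:
    """Return (text, label) pairs for texts the scorer is confident about.

    predict_proba is assumed to return the probability of the positive
    ("sexist") class. Label 1 is positive, label 0 is negative.
    """
    synthetic: List[Tuple[str, int]] = []
    for text in unlabelled:
        p = predict_proba(text)
        if p >= threshold:
            synthetic.append((text, 1))
        elif p <= 1.0 - threshold:
            synthetic.append((text, 0))
        # Anything in between is too uncertain and is left out.
    return synthetic


def toy_scorer(text: str) -> float:
    """Stand-in scorer for illustration only; a fine-tuned model would go here."""
    lowered = text.lower()
    if "flagged" in lowered:
        return 0.95
    if "neutral" in lowered:
        return 0.03
    return 0.5


if __name__ == "__main__":
    pool = ["flagged post", "neutral post", "ambiguous post"]
    for text, label in pseudo_label(toy_scorer, pool):
        print(label, text)
```

In practice the scorer would be replaced by a classifier fine-tuned on the labelled training set, and the selected pairs would be merged into that set before retraining.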

artificial intelligence, machine learning, natural language

Country:
- North America
  - United States > California
    - San Diego County > San Diego (0.04)
  - Canada > Ontario
    - Toronto (0.04)
- Europe > France
  - Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
- Africa > Nigeria
  - Kaduna State > Kaduna (0.04)

Genre:
- Research Report > New Finding (1.00)

Industry:
- Media > News (0.56)

Technology:
- Information Technology
  - Communications > Social Media (1.00)
  - Artificial Intelligence
    - Natural Language > Information Extraction (0.48)
    - Machine Learning
      - Neural Networks > Deep Learning (0.71)
      - Statistical Learning (0.69)
