University of Indonesia at SemEval-2025 Task 11: Evaluating State-of-the-Art Encoders for Multi-Label Emotion Detection

Hanif, Ikhlasul Akmal; Yulianrifat, Eryawan Presma; Ongris, Jaycent Gunawan; Tjitrahardja, Eduardus; Azmi, Muhammad Falensi; Naufal, Rahmat Bryan; Wicaksono, Alfan Farizki

May-23-2025–arXiv.org Artificial Intelligence

This paper presents our approach to SemEval-2025 Task 11 Track A, multi-label emotion classification across 28 languages. We explore two main strategies, fully fine-tuning transformer models and classifier-only training, and evaluate different settings such as fine-tuning strategies, model architectures, loss functions, encoders, and classifiers. Our findings suggest that training a classifier on top of prompt-based encoders such as mE5 and BGE yields significantly better results than fully fine-tuning XLMR and mBERT. Our best-performing model on the final leaderboard is an ensemble of multiple BGE models in different configurations, with CatBoost serving as the classifier. This ensemble achieves an average macro-F1 score of 56.58 across all languages.
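For readers unfamiliar with the reported metric, the following is a minimal sketch of macro-averaged F1 for multi-label predictions: F1 is computed separately for each emotion label and the per-label scores are then averaged with equal weight. The function name `macro_f1`, the toy three-label data, and the convention of scoring a label as 0.0 when it has no true or predicted positives are illustrative assumptions, not details taken from the paper. The function returns a fraction between 0 and 1; the 56.58 quoted in the abstract is presumably the same quantity on a percentage scale.

```python
from typing import Sequence


def macro_f1(y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]) -> float:
    """Macro-averaged F1 over labels for 0/1 multi-label indicator rows.

    Each row is one text; each column is one emotion label.
    Assumption: a label with no true and no predicted positives scores 0.0.
    """
    n_labels = len(y_true[0])
    per_label_f1 = []
    for j in range(n_labels):
        # Count true positives, false positives, and false negatives for label j.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        denom = 2 * tp + fp + fn
        per_label_f1.append(2 * tp / denom if denom else 0.0)
    # Every label contributes equally, regardless of how often it occurs.
    return sum(per_label_f1) / n_labels


# Toy example with three hypothetical labels (columns) and four texts (rows).
y_true = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 1], [0, 1, 0], [0, 0, 1]]
score = macro_f1(y_true, y_pred)
print(round(score, 4))
```

Because each label is weighted equally, a rare emotion that a model handles poorly lowers the score as much as a frequent one would, which is one reason macro averaging is a common choice for imbalanced multi-label tasks.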

large language model, machine learning, natural language

Country:
- Africa > Mali (0.04)
- Asia
  - China (0.04)
  - Indonesia (0.04)
  - Japan > Kyūshū & Okinawa
    - Kyūshū > Miyazaki Prefecture > Miyazaki (0.04)
  - Middle East
    - Israel (0.04)
    - Jordan (0.04)
    - Saudi Arabia > Asir Province
      - Abha (0.04)
    - UAE > Abu Dhabi Emirate
      - Abu Dhabi (0.14)
  - Myanmar > Tanintharyi Region
    - Dawei (0.04)
- Europe
  - Austria > Vienna (0.14)
  - Croatia > Dubrovnik-Neretva County
    - Dubrovnik (0.04)
  - Eastern Europe (0.04)
  - Monaco (0.04)
- North America
  - Canada > Ontario
    - Toronto (0.04)
  - Central America (0.04)
  - Mexico > Mexico City
    - Mexico City (0.04)
- South America (0.04)

Genre:
- Research Report > New Finding (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning
    - Neural Networks > Deep Learning (1.00)
    - Statistical Learning (0.93)
  - Natural Language > Large Language Model (1.00)
