EM2LDL: A Multilingual Speech Corpus for Mixed Emotion Recognition through Label Distribution Learning
Li, Xingfeng, Shi, Xiaohan, Li, Junjie, Li, Yongwei, Unoki, Masashi, Toda, Tomoki, Akagi, Masato
–arXiv.org Artificial Intelligence
This study introduces EM2LDL, a novel multilingual speech corpus designed to advance mixed emotion recognition through label distribution learning. Addressing the limitations of predominantly monolingual and single-label emotion corpora \textcolor{black}{that restrict linguistic diversity, are unable to model mixed emotions, and lack ecological validity}, EM2LDL comprises expressive utterances in English, Mandarin, and Cantonese, capturing the intra-utterance code-switching prevalent in multilingual regions like Hong Kong and Macao. The corpus integrates spontaneous emotional expressions from online platforms, annotated with fine-grained emotion distributions across 32 categories. Experimental baselines using self-supervised learning models demonstrate robust performance in speaker-independent gender-, age-, and personality-based evaluations, with HuBERT-large-EN achieving optimal results. By incorporating linguistic diversity and ecological validity, EM2LDL enables the exploration of complex emotional dynamics in multilingual settings. This work provides a versatile testbed for developing adaptive, empathetic systems for applications in affective computing, including mental health monitoring and cross-cultural communication. The dataset, annotations, and baseline codes are publicly available at https://github.com/xingfengli/EM2LDL.
arXiv.org Artificial Intelligence
Nov-26-2025
- Country:
- Asia
- Europe
- Switzerland (0.04)
- United Kingdom > England
- Cambridgeshire > Cambridge (0.04)
- Oxfordshire > Oxford (0.04)
- North America > United States
- New Jersey > Middlesex County > Piscataway (0.04)
- Oceania > Australia
- Australian Capital Territory > Canberra (0.05)
- Genre:
- Overview (0.93)
- Research Report > New Finding (0.93)
- Industry:
- Technology: