EmoTale: An Enacted Speech-emotion Dataset in Danish

Hjuler, Maja J., Skat-Rørdam, Harald V., Clemmensen, Line H., Das, Sneha

arXiv.org Artificial Intelligence 

While multiple emotional speech corpora exist for commonly spoken languages, there is a lack of functional datasets for smaller (spoken) languages, such as Danish. To our knowledge, Danish Emotional Speech (DES), published in 1997, is the only other database of Danish emotional speech. We demonstrate the validity of the dataset by investigating and presenting its predictive power using speech emotion recognition (SER) models. We develop SER models for EmoTale and the reference datasets using self-supervised speech model (SSLM) embeddings and the openSMILE feature extractor. We find the embeddings superior to the hand-crafted features. The best model achieves an unweighted average recall (UAR) of 64.1% on the EmoTale corpus using leave-one-speaker-out cross-validation, comparable to the performance on DES.

Speech signals are rich in information, both linguistic (in the form of sentences and words) and paralinguistic (denoting mood and affective state). Speech also carries information about multiple, potentially personal traits of the speaker, such as age, gender, and nationality. Multiple psychological and neuroscientific models of the mind hypothesize that language and emotion are linked [1]. For example, some cultures express anger more vocally, while others might be more restrained.
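
The evaluation protocol named above (UAR under leave-one-speaker-out cross-validation) can be illustrated with a minimal sketch. It is not the authors' pipeline: the nearest-centroid classifier is a toy stand-in for an SER model, all function names are hypothetical, and pooling predictions across folds before computing UAR is an assumption (averaging per-fold UAR is another common choice).

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls, so every emotion class counts equally."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


def leave_one_speaker_out(speakers):
    """Yield (held_out_speaker, train_indices, test_indices), one fold per speaker."""
    for held_out in sorted(set(speakers)):
        train = [i for i, s in enumerate(speakers) if s != held_out]
        test = [i for i, s in enumerate(speakers) if s == held_out]
        yield held_out, train, test


def nearest_centroid_predict(train_x, train_y, test_x):
    """Toy stand-in classifier (assumption): pick the class with the closest mean."""
    sums = {}
    counts = defaultdict(int)
    for x, y in zip(train_x, train_y):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [a + b for a, b in zip(sums[y], x)]
        counts[y] += 1
    centroids = {y: [v / counts[y] for v in s] for y, s in sums.items()}

    def squared_distance(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return [
        min(centroids, key=lambda y: squared_distance(x, centroids[y]))
        for x in test_x
    ]


def loso_uar(features, labels, speakers):
    """Pool predictions over all speaker folds, then compute a single UAR."""
    y_true, y_pred = [], []
    for _, train, test in leave_one_speaker_out(speakers):
        preds = nearest_centroid_predict(
            [features[i] for i in train],
            [labels[i] for i in train],
            [features[i] for i in test],
        )
        y_true.extend(labels[i] for i in test)
        y_pred.extend(preds)
    return unweighted_average_recall(y_true, y_pred)
```

Because each fold holds out every utterance of one speaker, the model is never tested on a voice it has seen during training. UAR differs from plain accuracy on imbalanced data: a model that always predicts the majority class on a 3:1 two-class split reaches 75% accuracy but only 50% UAR.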
