Multi-Modality in Music: Predicting Emotion in Music from High-Level Audio Features and Lyrics
Krols, Tibor, Nikolova, Yana, Oldenburg, Ninell
–arXiv.org Artificial Intelligence
API that makes a wide range of features accessible and therefore open to the public. This paper aims to test whether a multimodal approach for music emotion recognition (MER) performs better than a unimodal one on high-level song features and lyrics. We use 11 song features retrieved from the Spotify API, combined with lyrics features including sentiment, TF-IDF and ANEW, to predict valence and arousal (Russell, 1980) scores on the Deezer Mood Detection Dataset (DMDD) (Delbouys et al., 2018) with 4 different regression models.

But which features can actually predict the emotion of a song, and how well do Spotify's annotations perform? Building on the existing literature presented in Section 2, we hypothesize that a multimodal approach combining high-level auditory and lyrics-extracted features performs better than a uni-modal one (Y.-H. Yang, Lin, Cheng, et al., 2008; Hu & Downie, 2010b, 2010a). We introduce our MER model in Section 3 before presenting and discussing the results of our exploratory and regression experiments in Sections 4 and 5.
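The multimodal setup described above can be illustrated with a minimal sketch: concatenating high-level audio features with TF-IDF lyric features and fitting a regression model to valence scores. The feature names, toy data, and the use of plain linear regression below are illustrative assumptions, not the authors' actual pipeline or the DMDD dataset.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LinearRegression

# Toy stand-ins for Spotify-style high-level audio features
# (e.g. danceability, energy, tempo) for three songs.
audio = np.array([
    [0.8, 0.9, 120.0],
    [0.3, 0.2, 70.0],
    [0.6, 0.5, 100.0],
])

# Toy lyric snippets, one per song.
lyrics = [
    "sunshine dancing happy love",
    "rain alone cold tears",
    "road home night stars",
]

# TF-IDF representation of the lyrics.
tfidf = TfidfVectorizer().fit_transform(lyrics).toarray()

# Multimodal feature matrix: audio features concatenated with lyric features.
X = np.hstack([audio, tfidf])

# Illustrative valence targets on Russell's circumplex axis.
y_valence = np.array([0.9, -0.7, 0.1])

# One of several possible regressors; the paper compares 4 models.
model = LinearRegression().fit(X, y_valence)
preds = model.predict(X)
print(preds.shape)  # one valence prediction per song
```

An arousal model would be trained the same way against arousal targets; the unimodal baselines simply drop either the `audio` or the `tfidf` block before fitting.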
Feb-26-2023