Multi-Modality in Music: Predicting Emotion in Music from High-Level Audio Features and Lyrics
Krols, Tibor, Nikolova, Yana, Oldenburg, Ninell
–arXiv.org Artificial Intelligence
API that makes a wide range of features accessible and therefore open to the public. This paper aims to test whether a multimodal approach for music emotion recognition (MER) performs better than a unimodal one on high-level song features and lyrics. We use 11 song features retrieved from the Spotify API, combined with lyrics features including sentiment, TF-IDF and ANEW, to predict valence and arousal (Russell, 1980) scores on the Deezer Mood Detection Dataset (DMDD) (Delbouys et al., 2018) with 4 different regression models.

But which features can actually predict the emotion of a song, and how well do Spotify's annotations perform? Building on the existing literature presented in Section 2, we hypothesize that a multimodal approach combining high-level auditory and lyrics-extracted features performs better than a uni-modal one (Y.-H. Yang, Lin, Cheng, et al., 2008; Hu & Downie, 2010b, 2010a). We introduce our MER model in Section 3 before presenting and discussing the results of our exploratory and regression experiments in Sections 4 and 5.
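The multimodal setup described above can be illustrated with a minimal sketch: concatenating high-level audio features with TF-IDF lyric features and fitting a regression model to valence scores. The feature names, toy data, and the use of plain linear regression below are illustrative assumptions, not the authors' actual pipeline or the DMDD dataset.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LinearRegression

# Toy stand-ins for Spotify-style high-level audio features
# (e.g. danceability, energy, tempo) for three songs.
audio = np.array([
    [0.8, 0.9, 120.0],
    [0.3, 0.2, 70.0],
    [0.6, 0.5, 100.0],
])

# Toy lyric snippets, one per song.
lyrics = [
    "sunshine dancing happy love",
    "rain alone cold tears",
    "road home night stars",
]

# TF-IDF representation of the lyrics.
tfidf = TfidfVectorizer().fit_transform(lyrics).toarray()

# Multimodal feature matrix: audio features concatenated with lyric features.
X = np.hstack([audio, tfidf])

# Illustrative valence targets on Russell's circumplex axis.
y_valence = np.array([0.9, -0.7, 0.1])

# One of several possible regressors; the paper compares 4 models.
model = LinearRegression().fit(X, y_valence)
preds = model.predict(X)
print(preds.shape)  # one valence prediction per song
```

An arousal model would be trained the same way against arousal targets; the unimodal baselines simply drop either the `audio` or the `tfidf` block before fitting.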
Feb-26-2023