From melodic note sequences to pitches using word2vec
arXiv.org Artificial Intelligence
Abstract

Applying the word2vec technique, commonly used in language modeling, to melodies, where notes are treated as words in sentences, enables the capture of pitch information. This study examines two datasets: 20 children's songs and an excerpt from a Bach sonata. The semantic space defining the embeddings has a very small dimension, namely 2. Notes are predicted from the 2, 3, or 4 preceding notes that establish the context. A multivariate analysis of the results shows that the semantic vectors representing the notes have a multiple correlation coefficient of approximately 0.80 with their pitches.

Keywords: Embedding; Machine Learning; Semantic meaning; Correlation

1. Introduction

What kind of meaning can we capture from musical notes using word embedding techniques typically applied in language models? This study addresses this question by modeling various types of music with a relatively simple neural network commonly used for word embedding. An embedding is a vector representation of an entity (a word, an image, a sound) in a multidimensional space where geometric relationships between vectors reflect semantic relationships between the corresponding entities (Chollet, 2021). This inquiry is not new: numerous statistical and computational models, including neural networks, have been proposed to capture key features of musical pieces and to model music perception. In 2016, Madjiheurem, Qu and Walder compared different embedding techniques to learn musical chord embeddings.
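The setup described above — small-dimensional note embeddings learned by predicting each note from the few notes preceding it, then checked for correlation with pitch — can be sketched with a minimal CBOW-style model in plain NumPy. The toy melodies, note names, learning rate, and epoch count below are illustrative assumptions, not the paper's data or exact architecture.

```python
import numpy as np

# Toy melodies: notes treated as "words" in "sentences" (hypothetical data,
# standing in for the paper's children's songs and Bach excerpt).
melodies = [
    ["C4", "D4", "E4", "F4", "G4", "F4", "E4", "D4", "C4"],
    ["G4", "E4", "E4", "F4", "D4", "D4", "C4", "E4", "G4"],
]

vocab = sorted({n for m in melodies for n in m})
idx = {n: i for i, n in enumerate(vocab)}
V, D, CONTEXT = len(vocab), 2, 3  # 2-dim embeddings, 3 preceding notes

rng = np.random.default_rng(0)
W_in = rng.normal(scale=0.1, size=(V, D))   # note embeddings
W_out = rng.normal(scale=0.1, size=(D, V))  # output projection

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# CBOW-style training: predict each note from its CONTEXT preceding notes.
for epoch in range(200):
    for mel in melodies:
        for t in range(CONTEXT, len(mel)):
            ctx = [idx[n] for n in mel[t - CONTEXT:t]]
            target = idx[mel[t]]
            h = W_in[ctx].mean(axis=0)        # average context embeddings
            p = softmax(h @ W_out)
            dscore = p.copy()                 # cross-entropy gradient
            dscore[target] -= 1.0
            W_out -= 0.1 * np.outer(h, dscore)
            dh = W_out @ dscore
            for c in ctx:
                W_in[c] -= 0.1 * dh / CONTEXT

embeddings = {n: W_in[idx[n]] for n in vocab}

# Multiple correlation of the 2-dim vectors with pitch: regress MIDI pitch
# on the two embedding coordinates and correlate pitch with the fit.
midi = {"C4": 60, "D4": 62, "E4": 64, "F4": 65, "G4": 67}
y = np.array([midi[n] for n in vocab], float)
X = np.column_stack([np.ones(V), W_in])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
R = np.corrcoef(y, X @ beta)[0, 1]
```

On this toy data R will not match the paper's reported 0.80; the sketch only shows how such a coefficient can be computed from learned note vectors.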
Oct-29-2024