Goto

Collaborating Authors

 chorale


Computational music analysis from first principles

Tymoczko, Dmitri, Newman, Mark

arXiv.org Artificial Intelligence

We use coupled hidden Markov models to automatically annotate the 371 Bach chorales in the Riemenschneider edition, a corpus containing approximately 100,000 notes and 20,000 chords. We give three separate analyses that achieve progressively greater accuracy at the cost of making increasingly strong assumptions about musical syntax. Although our method makes almost no use of human input, we are able to identify both chords and keys with an accuracy of 85% or greater when compared to an expert human analysis, resulting in annotations accurate enough to be used for a range of music-theoretical purposes, while also being free of subjective human judgments. Our work bears on longstanding debates about the objective reality of the structures postulated by standard Western harmonic theory, as well as on specific questions about the nature of Western harmonic syntax.


Exploring Latent Spaces of Tonal Music using Variational Autoencoders

Carvalho, Nádia, Bernardes, Gilberto

arXiv.org Artificial Intelligence

Variational Autoencoders (VAEs) have proven to be effective models for producing latent representations of cognitive and semantic value. We assess the degree to which VAEs trained on a prototypical tonal music corpus of 371 Bach's chorales define latent spaces representative of the circle of fifths and the hierarchical relation of each key component pitch as drawn in music cognition. In detail, we compare the latent space of different VAE corpus encodings -- Piano roll, MIDI, ABC, Tonnetz, DFT of pitch, and pitch class distributions -- in providing a pitch space for key relations that align with cognitive distances. We evaluate the model performance of these encodings using objective metrics to capture accuracy, mean square error (MSE), KL-divergence, and computational cost. The ABC encoding performs the best in reconstructing the original data, while the Pitch DFT seems to capture more information from the latent space. Furthermore, an objective evaluation of 12 major or minor transpositions per piece is adopted to quantify the alignment of 1) intra- and inter-segment distances per key and 2) the key distances to cognitive pitch spaces. Our results show that Pitch DFT VAE latent spaces align best with cognitive spaces and provide a common-tone space where overlapping objects within a key are fuzzy clusters, which impose a well-defined order of structural significance or stability -- i.e., a tonal hierarchy. Tonal hierarchies of different keys can be used to measure key distances and the relationships of their in-key components at multiple hierarchies (e.g., notes and chords). The implementation of our VAE and the encodings framework are made available online.


Chord-Conditioned Melody Choralization with Controllable Harmonicity and Polyphonicity

Wu, Shangda, Li, Xiaobing, Sun, Maosong

arXiv.org Artificial Intelligence

Melody choralization, i.e. generating a four-part chorale based on a user-given melody, has long been closely associated with J.S. Bach chorales. Previous neural network-based systems rarely focus on chorale generation conditioned on a chord progression, and none of them realised controllable melody choralization. To enable neural networks to learn the general principles of counterpoint from Bach's chorales, we first design a music representation that encoded chord symbols for chord conditioning. We then propose DeepChoir, a melody choralization system, which can generate a four-part chorale for a given melody conditioned on a chord progression. Furthermore, with the improved density sampling, a user can control the extent of harmonicity and polyphonicity for the chorale generated by DeepChoir. Experimental results reveal the effectiveness of our data representation and the controllability of DeepChoir over harmonicity and polyphonicity. The code and generated samples (chorales, folk songs and a symphony) of DeepChoir, and the dataset we use now are available at https://github.com/sander-wood/deepchoir.


Exploring Graph Representation of Chorales

Phon-Amnuaisuk, Somnuk

arXiv.org Artificial Intelligence

This work explores areas overlapping music, graph theory, and machine learning. An embedding representation of a node, in a weighted undirected graph $\mathcal{G}$, is a representation that captures the meaning of nodes in an embedding space. In this work, 383 Bach chorales were compiled and represented as a graph. Two application cases were investigated in this paper (i) learning node embedding representation using \emph{Continuous Bag of Words (CBOW), skip-gram}, and \emph{node2vec} algorithms, and (ii) learning node labels from neighboring nodes based on a collective classification approach. The results of this exploratory study ascertains many salient features of the graph-based representation approach applicable to music applications.


GitHub - AI-Guru/music-generation-research: A straightforward collection of Music Generation research resources.

#artificialintelligence

This thesis investigates Bach's composition style using deep sequence learning. We develop BachBot: an automatic stylistic composition system for composing polyphonic music in the style of Bach's chorales. We find a 3-layer stacked LSTM performs best and conduct analyses and evaluations to understand its success and failure modes. Unlike many previous works, we avoid allowing prior assumptions about music impact model design, opting instead to build systems that learn rather than ones which encode prior hypotheses. While this is not the first application of deep LSTM to Bach chorales, our work consists of the following novel contributions.


BacHMMachine: An Interpretable and Scalable Model for Algorithmic Harmonization for Four-part Baroque Chorales

Zhu, Yunyao, Hahn, Stephen, Mak, Simon, Jiang, Yue, Rudin, Cynthia

arXiv.org Machine Learning

Algorithmic harmonization - the automated harmonization of a musical piece given its melodic line - is a challenging problem that has garnered much interest from both music theorists and computer scientists. One genre of particular interest is the four-part Baroque chorales of J.S. Bach. Methods for algorithmic chorale harmonization typically adopt a black-box, "data-driven" approach: they do not explicitly integrate principles from music theory but rely on a complex learning model trained with a large amount of chorale data. We propose instead a new harmonization model, called BacHMMachine, which employs a "theory-driven" framework guided by music composition principles, along with a "data-driven" model for learning compositional features within this framework. As its name suggests, BacHMMachine uses a novel Hidden Markov Model based on key and chord transitions, providing a probabilistic framework for learning key modulations and chordal progressions from a given melodic line. This allows for the generation of creative, yet musically coherent chorale harmonizations; integrating compositional principles allows for a much simpler model that results in vast decreases in computational burden and greater interpretability compared to state-of-the-art algorithmic harmonization methods, at no penalty to quality of harmonization or musicality. We demonstrate this improvement via comprehensive experiments and Turing tests comparing BacHMMachine to existing methods.


Incorporating Music Knowledge in Continual Dataset Augmentation for Music Generation

Liu, Alisa, Fang, Alexander, Hadjeres, Gaëtan, Seetharaman, Prem, Pardo, Bryan

arXiv.org Machine Learning

Deep learning has rapidly become the state-of-the-art approach for music generation. However, training a deep model typically requires a large training set, which is often not available for specific musical styles. In this paper, we present augmentative generation (Aug-Gen), a method of dataset augmentation for any music generation system trained on a resource-constrained domain. The key intuition of this method is that the training data for a generative system can be augmented by examples the system produces during the course of training, provided these examples are of sufficiently high quality and variety. We apply Aug-Gen to Transformer-based chorale generation in the style of J.S. Bach, and show that this allows for longer training and results in better generative output.


Bach or Mock? A Grading Function for Chorales in the Style of J.S. Bach

Fang, Alexander, Liu, Alisa, Seetharaman, Prem, Pardo, Bryan

arXiv.org Machine Learning

Deep generative systems that learn probabilistic models from a corpus of existing music do not explicitly encode knowledge of a musical style, compared to traditional rule-based systems. Thus, it can be difficult to determine whether deep models generate stylistically correct output without expert evaluation, but this is expensive and time-consuming. Therefore, there is a need for automatic, interpretable, and musically-motivated evaluation measures of generated music. In this paper, we introduce a grading function that evaluates four-part chorales in the style of J.S. Bach along important musical features. We use the grading function to evaluate the output of a Transformer model, and show that the function is both interpretable and outperforms human experts at discriminating Bach chorales from model-generated ones.


Sara Adkins Is Jamming Out With Machines

#artificialintelligence

Asking machines to make music by themselves is kind of a strange notion. They don't feel happy or hurt, and as far as we know, they don't long for the affections of other machines. Humans like to think of music as being a strictly human thing, a passionate undertaking so nuanced and emotion-based that a machine could never begin to understand the feeling that goes into the process of making music, or even the simple enjoyment of it. The idea of humans and machines having a jam session together is even stranger. But oddly enough, the principles of the jam session may be exactly what machines need to begin to understand musical expression.


Improving Polyphonic Music Models with Feature-Rich Encoding

Peracha, Omar

arXiv.org Machine Learning

This paper explores sequential modeling of polyphonic music with deep neural networks. While recent breakthroughs have focussed on network architecture, we demonstrate that the representation of the sequence can make an equally significant contribution to the performance of the model as measured by validation set loss. By extracting salient features inherent to the dataset, the model can either be conditioned on these features or trained to predict said features as extra components of the sequences being modeled. We show that training a neural network to predict a seemingly more complex sequence, with extra features included in the series being modeled, can improve overall model performance significantly. We first introduce TonicNet, a GRU-based model trained to initially predict the chord at a given time-step before then predicting the notes of each voice at that time-step, in contrast with the typical approach of predicting only the notes. We then evaluate TonicNet on the canonical JSB Chorales dataset and obtain state-of-the-art results.