Classical Music



The Spheres Dataset: Multitrack Orchestral Recordings for Music Source Separation and Information Retrieval

Garcia-Martinez, Jaime, Diaz-Guerra, David, Anderson, John, Falcon-Perez, Ricardo, Cabañas-Molero, Pablo, Virtanen, Tuomas, Carabias-Orti, Julio J., Vera-Candeas, Pedro

arXiv.org Artificial Intelligence

This paper introduces The Spheres dataset, multitrack orchestral recordings designed to advance machine learning research in music source separation and related MIR tasks within the classical music domain. The dataset comprises over one hour of recordings of musical pieces performed by the Colibrì Ensemble at The Spheres recording studio, capturing two canonical works - Tchaikovsky's Romeo and Juliet and Mozart's Symphony No. 40 - along with chromatic scales and solo excerpts for each instrument. The recording setup employed 23 microphones, including close spot, main, and ambient microphones, enabling the creation of realistic stereo mixes with controlled bleeding and providing isolated stems for supervised training of source separation models. In addition, room impulse responses were estimated for each instrument position, offering valuable acoustic characterization of the recording space. We present the dataset structure, acoustic analysis, and baseline evaluations using X-UMX based models for orchestral family separation and microphone debleeding. Results highlight both the potential and the challenges of source separation in complex orchestral scenarios, underscoring the dataset's value for benchmarking and for exploring new approaches to separation, localization, dereverberation, and immersive rendering of classical music.
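The stem-plus-mix design described above is what enables supervised separation training: stereo references can be rendered from the isolated stems. Below is a minimal sketch of such a render for mono stems of equal length; the constant-power gain/pan scheme and all names here are illustrative assumptions, not the dataset's actual mixing procedure.

```python
import numpy as np

def stereo_mix(stems, gains_db, pans):
    """Sum mono stems into a stereo mix with per-stem gain (dB) and
    constant-power panning. pans range from -1 (left) to +1 (right)."""
    mix = np.zeros((2, len(stems[0])))
    for stem, gain_db, pan in zip(stems, gains_db, pans):
        g = 10 ** (gain_db / 20)                # dB -> linear amplitude
        theta = (pan + 1) * np.pi / 4           # map [-1, 1] to [0, pi/2]
        mix[0] += g * np.cos(theta) * stem      # left channel
        mix[1] += g * np.sin(theta) * stem      # right channel
    return mix

# Two toy "stems": one hard left at unity gain, one hard right at -6 dB.
stems = [np.ones(4), np.ones(4)]
mix = stereo_mix(stems, gains_db=[0.0, -6.0], pans=[-1.0, 1.0])
```

Because the stems are isolated, the same function can regenerate the mix with any instrument attenuated or removed, which is exactly the kind of ground-truth pairing source separation training needs.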


BACHI: Boundary-Aware Symbolic Chord Recognition Through Masked Iterative Decoding on Pop and Classical Music

Yao, Mingyang, Chen, Ke, Dubnov, Shlomo, Berg-Kirkpatrick, Taylor

arXiv.org Artificial Intelligence

Automatic chord recognition (ACR) via deep learning models has gradually achieved promising recognition accuracy, yet two key challenges remain. First, prior work has primarily focused on audio-domain ACR, while symbolic music (e.g., score) ACR has received limited attention due to data scarcity. Second, existing methods still overlook strategies that are aligned with human music analytical practices. To address these challenges, we make two contributions: (1) we introduce POP909-CL, an enhanced version of the POP909 dataset with tempo-aligned content and human-corrected labels of chords, beats, keys, and time signatures; and (2) we propose BACHI, a symbolic chord recognition model that decomposes the task into distinct decision steps, namely boundary detection and iterative ranking of chord root, quality, and bass (inversion). This mechanism mirrors human ear-training practice. Experiments demonstrate that BACHI achieves state-of-the-art chord recognition performance on both classical and pop music benchmarks, with ablation studies validating the effectiveness of each module.
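The boundary-then-ranking decomposition can be pictured with a toy decoder: first find segment boundaries where the active pitch-class set changes, then rank (root, quality) candidates within each segment by template overlap. This is an illustrative sketch of the idea, not the authors' model; the templates and names are assumptions.

```python
# Toy boundary-then-ranking chord decoder, loosely following BACHI's
# decomposition into boundary detection and candidate ranking.
QUALITY_TEMPLATES = {
    "maj": {0, 4, 7},   # root, major third, perfect fifth
    "min": {0, 3, 7},   # root, minor third, perfect fifth
}

def detect_boundaries(frames):
    """A segment boundary wherever the active pitch-class set changes."""
    return [i for i in range(1, len(frames)) if frames[i] != frames[i - 1]]

def rank_chords(pcs):
    """Score every (root, quality) pair by overlap with observed pitch classes."""
    scored = []
    for root in range(12):
        for quality, template in QUALITY_TEMPLATES.items():
            chord = {(root + iv) % 12 for iv in template}
            scored.append((len(chord & pcs), root, quality))
    scored.sort(reverse=True)
    return scored

# Four frames: C major triad twice, then D minor triad twice.
frames = [frozenset({0, 4, 7})] * 2 + [frozenset({2, 5, 9})] * 2
print(detect_boundaries(frames))   # [2] -- one boundary, at frame 2
best = rank_chords(frames[2])[0]
print(best[1], best[2])            # root 2 ("D"), quality "min"
```

A real system replaces the fixed templates with learned scores and decodes root, quality, and bass iteratively, but the control flow is the same: segment first, then rank.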


Source Separation of Small Classical Ensembles: Challenges and Opportunities

Roa-Dabike, Gerardo, Cox, Trevor J., Barker, Jon P., Akeroyd, Michael A., Bannister, Scott, Fazenda, Bruno, Firth, Jennifer, Graetzer, Simone, Greasley, Alinka, Vos, Rebecca R., Whitmer, William M.

arXiv.org Artificial Intelligence

Music source separation (MSS) of western popular music using non-causal deep learning can be very effective. In contrast, MSS for classical music is an unsolved problem. Classical ensembles are harder to separate than popular music because of issues such as the inherently greater variation in the music; the sparsity of recordings with ground truth for supervised training; and greater ambiguity between instruments. The Cadenza project has been exploring MSS for classical music. This is being done so music can be remixed to improve listening experiences for people with hearing loss. To enable the work, a new database of synthesized woodwind ensembles was created to overcome instrumental imbalances in the EnsembleSet. For the MSS, a set of ConvTasNet models was used, with each model being trained to extract a string or woodwind instrument. ConvTasNet was chosen because it enabled both causal and non-causal approaches to be tested. Non-causal approaches have dominated MSS work and are useful for recorded music, but for live music or processing on hearing aids, causal signal processing is needed. The MSS performance was evaluated on two small datasets (Bach10 and URMP) of real instrument recordings where the ground truth is available. The performances of the causal and non-causal systems were similar. Comparing the average Signal-to-Distortion Ratio (SDR) of the synthesized validation set (6.2 dB causal; 6.9 dB non-causal) to the real recorded evaluation set (0.3 dB causal; 0.4 dB non-causal) shows that the mismatch between synthesized and recorded data is a problem. Future work needs to either gather more real recordings that can be used for training, or to improve the realism and diversity of the synthesized recordings to reduce the mismatch...
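The SDR figures quoted above follow the standard definition, SDR = 10 log10(Σ s² / Σ (s - ŝ)²), where s is the ground-truth stem and ŝ the separated estimate. A minimal sketch of the computation (illustrative helper, not the Cadenza project's evaluation code, which typically uses the full BSS Eval toolchain):

```python
import numpy as np

def sdr_db(reference: np.ndarray, estimate: np.ndarray) -> float:
    """Signal-to-Distortion Ratio in dB between a ground-truth stem and a
    separated estimate (higher is better; epsilon guards a perfect match)."""
    error = reference - estimate
    return 10.0 * np.log10(np.sum(reference ** 2) / (np.sum(error ** 2) + 1e-12))

# Toy example: a 440 Hz "stem" recovered with small additive noise.
t = np.linspace(0, 1, 16000, endpoint=False)
ref = np.sin(2 * np.pi * 440 * t)
est = ref + 0.01 * np.random.default_rng(0).standard_normal(len(t))
print(round(sdr_db(ref, est), 1))
```

The gap the abstract reports - roughly 6 dB on synthesized data versus under 1 dB on real recordings - is measured on exactly this per-stem quantity, averaged across instruments.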


Exploratory Study Of Human-AI Interaction For Hindustani Music

Shikarpur, Nithya, Huang, Cheng-Zhi Anna

arXiv.org Artificial Intelligence

This paper presents a study of participants interacting with and using GaMaDHaNi, a novel hierarchical generative model for Hindustani vocal contours. To explore possible use cases in human-AI interaction, we conducted a user study with three participants, each engaging with the model through three predefined interaction modes. Although this study was conducted "in the wild" -- with the model unadapted for the shift from the training data to real-world interaction -- we use it as a pilot to better understand the expectations, reactions, and preferences of practicing musicians when engaging with such a model. We identify two main challenges they faced: (1) the lack of restrictions on model output, and (2) the incoherence of model output. We situate these challenges in the context of Hindustani music and suggest future directions for model design to address these gaps.


Simulation of Neural Responses to Classical Music Using Organoid Intelligence Methods

Szelogowski, Daniel

arXiv.org Artificial Intelligence

Music is a complex auditory stimulus capable of eliciting significant changes in brain activity, influencing cognitive processes such as memory, attention, and emotional regulation. However, the underlying mechanisms of music-induced cognitive processes remain largely unknown. Organoid intelligence and deep learning models show promise for simulating and analyzing these neural responses to classical music, an area that remains largely unexplored in computational neuroscience. Hence, we present the PyOrganoid library, an innovative tool that facilitates the simulation of organoid learning models, integrating sophisticated machine learning techniques with biologically inspired organoid simulations. Our study features the development of the Pianoid model, a "deep organoid learning" model that utilizes a Bidirectional LSTM network to predict EEG responses based on audio features from classical music recordings. This model demonstrates the feasibility of using computational methods to replicate complex neural processes, providing valuable insights into music perception and cognition. Moreover, our findings emphasize the utility of synthetic models in neuroscience research and highlight the PyOrganoid library's potential as a versatile tool for advancing studies in neuroscience and artificial intelligence.
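The core mapping described here - audio feature frames in, predicted EEG channel values out, via a bidirectional LSTM - can be sketched as a sequence-to-sequence regressor. This is a generic PyTorch illustration of that architecture under assumed dimensions (20 audio features, 8 EEG channels), not the PyOrganoid library's API or the actual Pianoid model.

```python
import torch
import torch.nn as nn

class EEGFromAudio(nn.Module):
    """Illustrative BiLSTM regressor mapping audio feature frames to
    EEG channel values, frame by frame."""
    def __init__(self, n_features=20, hidden=64, n_channels=8):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True,
                            bidirectional=True)
        # Bidirectional LSTM concatenates both directions: 2 * hidden.
        self.head = nn.Linear(2 * hidden, n_channels)

    def forward(self, x):          # x: (batch, time, n_features)
        out, _ = self.lstm(x)      # (batch, time, 2 * hidden)
        return self.head(out)      # (batch, time, n_channels)

model = EEGFromAudio()
pred = model(torch.randn(2, 100, 20))   # 2 clips, 100 frames each
print(pred.shape)                       # torch.Size([2, 100, 8])
```

Training such a model against recorded EEG would use an ordinary regression loss (e.g. MSE) per frame and channel.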


Can AI Emulate Human Creativity?

#artificialintelligence

If you work out of an office, you know that the coffee machine is the favorite spot in the office to hang out and have conversations. From giving us the first cup of the day to keeping us awake for late-night meetings, that machine is a lifesaver. But just for a day, try not getting your coffee from the coffee machine. Don't skip coffee entirely; instead, go out to a local coffee shop that doesn't use coffee machines, or make yourself a flask at home. You will realize that hand-made coffee is inherently better than coffee made by a machine.


Why Improvisation Is the Future in an AI-Dominated World

#artificialintelligence

In his autobiography, Miles Davis complained that classical musicians were like robots. He spoke from experience – he'd studied classical music at Juilliard and recorded with classical musicians even after becoming a world-renowned jazz artist. As a music professor at the University of Florida, which is transforming itself into an "AI university," I often think about Davis' words, and the ways in which musicians have become more machinelike over the past century. At the same time, I see how machines have been getting better at mimicking human improvisation, in all aspects of life. I wonder what the limits of machine improvisation will be, and which human activities will survive the rise of intelligent machines.


Machine learning helps retrace evolution of classical music

AIHub

Researchers in EPFL's Digital and Cognitive Musicology Lab used an unsupervised machine learning model to "listen to" and categorize more than 13,000 pieces of Western classical music, revealing how modes – such as major and minor – have changed throughout history. Many people may not be able to define what a minor mode is in music, but most would almost certainly recognize a piece played in a minor key. That's because we intuitively differentiate the set of notes belonging to the minor scale – which tend to sound dark, tense, or sad – from those in the major scale, which more often connote happiness, strength, or lightness. But throughout history, there have been periods when multiple other modes were used in addition to major and minor – or when no clear separation between modes could be found at all. Understanding and visualizing these differences over time is what Digital and Cognitive Musicology Lab (DCML) researchers Daniel Harasim, Fabian Moss, Matthias Ramirez, and Martin Rohrmeier set out to do in a recent study, which has been published in the open-access journal Humanities and Social Sciences Communications.
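The unsupervised approach described here can be illustrated, in much-simplified form, by representing each piece as a normalized pitch-class histogram and clustering the histograms. This sketch recovers a major-like and a minor-like group from synthetic data; it is an illustration of the general idea, not the DCML lab's actual model, and the scale profiles are assumptions.

```python
import numpy as np

# Binary scale profiles over the 12 pitch classes (C = 0): which notes
# belong to a major scale vs. a natural minor scale on the same tonic.
MAJOR = np.array([1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1], dtype=float)
MINOR = np.array([1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0], dtype=float)

def kmeans_2(histograms, iters=20):
    """Tiny 2-means over pitch-class histograms (rows of `histograms`)."""
    centers = np.stack([histograms[0], histograms[-1]])
    for _ in range(iters):
        d = np.linalg.norm(histograms[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        for k in range(2):
            if (labels == k).any():
                centers[k] = histograms[labels == k].mean(axis=0)
    return labels

# Ten synthetic "pieces": five noisy major profiles, five noisy minor ones,
# each normalized to a pitch-class distribution.
rng = np.random.default_rng(0)
pieces = np.stack([p / p.sum() for p in
                   [MAJOR + 0.1 * rng.random(12) for _ in range(5)] +
                   [MINOR + 0.1 * rng.random(12) for _ in range(5)]])
labels = kmeans_2(pieces)
print(labels)   # the two mode families fall into separate clusters
```

Historical periods with more than two modes, or with no clean separation at all, would show up in this picture as more clusters or as overlapping ones, which is essentially the variation the study traces across 13,000 pieces.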


Classical music can help us perform better in exams, study reveals

Daily Mail - Science & tech

Listening to classical music during lectures and throughout the night while sleeping may help us perform better in big exams, a new study suggests. US economics students who listened to Beethoven and Chopin during a lecture and again later that night scored 18 per cent higher in exams the next day. This compared with a control group of students who attended the same lecture but slept that night with white noise playing in the background. Researchers say that the classical music activated a process called 'targeted memory reactivation' (TMR), in which the music stimulates the brain to consolidate memories. The study suggests classical music is the key to strengthening existing memories of lectures during sleep and, as a result, doing better in exams.