Music Transcription Based on Bayesian Piece-Specific Score Models Capturing Repetitions

Nakamura, Eita, Yoshii, Kazuyoshi

arXiv.org Artificial Intelligence 

Eita Nakamura, Kazuyoshi Yoshii, Member, IEEE

Abstract: Most work on models for music transcription has focused on describing the local sequential dependence of notes in musical scores and has failed to capture their global repetitive structure, which can be a useful guide for transcribing music. Focusing on rhythm, we formulate several classes of Bayesian Markov models of musical scores that describe repetitions indirectly through sparse transition probabilities of notes or note patterns. This enables us to construct piece-specific models for unseen scores whose repetitive structure is not fixed in advance and to derive tractable inference algorithms. Moreover, to describe approximate repetitions, we explicitly incorporate a process of modifying the repeated notes or note patterns. We apply these models as a prior music language model for rhythm transcription, where piece-specific score models are inferred from performed MIDI data by unsupervised learning, in contrast to the conventional supervised construction of score models. Evaluations using vocal melodies of popular music showed that the Bayesian models improved the transcription accuracy for most of the tested model types, indicating the universal efficacy of the proposed approach.

INTRODUCTION

Music transcription is an actively studied but still unsolved problem in music information processing [1], [2]. One of the goals of music transcription is to convert a music performance signal into a human-readable symbolic musical score. While recent studies have achieved highly accurate pitch detection [3]-[7], it is also necessary to transcribe rhythms in order to obtain a symbolic music representation [8]-[18]. Since there are many logically possible representations of rhythms for a given performance (including ones that are meaningless to humans) [11], using a score model that describes prior knowledge about musical scores is key to solving this problem.
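The idea in the abstract of describing repetitions indirectly through sparse transition probabilities can be illustrated with a small sketch. It assumes a first-order Markov chain over a hypothetical six-value rhythm alphabet and a symmetric Dirichlet prior on each row of the transition matrix; the alphabet, the concentration values, and all function names are illustrative assumptions, not the model classes defined in the paper. A small concentration parameter tends to yield rows dominated by one or two successors, so generated rhythms fall into repeating patterns.

```python
import math
import random

# Hypothetical rhythm alphabet: note values in beats (an assumption for illustration).
NOTE_VALUES = [0.25, 0.5, 0.75, 1.0, 1.5, 2.0]


def sample_dirichlet(rng, alpha, size):
    """Draw one probability vector from a symmetric Dirichlet(alpha) distribution."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(size)]
    total = sum(draws)
    if total == 0.0:  # guard against numerical underflow for very small alpha
        return [1.0 / size] * size
    return [d / total for d in draws]


def sample_transition_matrix(rng, alpha, size):
    """Each row is the next-note distribution given the current note."""
    return [sample_dirichlet(rng, alpha, size) for _ in range(size)]


def row_entropy(row):
    """Shannon entropy in nats; low entropy means a sparse, near-deterministic row."""
    return -sum(p * math.log(p) for p in row if p > 0.0)


def mean_entropy(matrix):
    """Average row entropy of a transition matrix."""
    return sum(row_entropy(r) for r in matrix) / len(matrix)


def generate(rng, matrix, length, start=0):
    """Generate a sequence of state indices from a first-order Markov chain."""
    seq = [start]
    for _ in range(length - 1):
        seq.append(rng.choices(range(len(matrix)), weights=matrix[seq[-1]])[0])
    return seq


rng = random.Random(0)
# Small alpha: sparse rows (piece-specific, repetitive). Large alpha: near-uniform rows.
sparse = sample_transition_matrix(rng, alpha=0.1, size=len(NOTE_VALUES))
dense = sample_transition_matrix(rng, alpha=10.0, size=len(NOTE_VALUES))

sparse_rhythm = [NOTE_VALUES[i] for i in generate(rng, sparse, 16)]
dense_rhythm = [NOTE_VALUES[i] for i in generate(rng, dense, 16)]
print("sparse:", sparse_rhythm, "mean row entropy:", round(mean_entropy(sparse), 3))
print("dense: ", dense_rhythm, "mean row entropy:", round(mean_entropy(dense), 3))
```

Comparing the two printed sequences gives a rough feel for why a sparse prior favors repetition: with few plausible successors per note value, the chain revisits the same short patterns, whereas near-uniform rows wander over the whole alphabet.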
A common approach to music transcription is to integrate a musical score (language) model and a performance/acoustic model to obtain a transcription that best fits an input performance signal, similarly to statistical speech recognition. More recently, end-to-end approaches have also been attempted [19]-[21], but they have had limited success so far.

Manuscript received XX, YY; revised XX, YY. This work was supported partially by JSPS KAKENHI (Nos. The work of EN was supported by the JSPS research fellowship (PD).
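The integration of a score model and a performance model described in the introduction is commonly written as maximum a posteriori estimation, as in statistical speech recognition. In the sketch below, the symbols $X$ (the observed performance signal) and $S$ (a candidate score) are notation assumed here for illustration and are not taken from the source.

```latex
\hat{S} \;=\; \operatorname*{arg\,max}_{S} P(S \mid X)
        \;=\; \operatorname*{arg\,max}_{S} P(X \mid S)\, P(S)
```

Here $P(X \mid S)$ plays the role of the performance/acoustic model and $P(S)$ the role of the score (language) model; the second equality follows from Bayes' rule because $P(X)$ does not depend on $S$.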
