Dexterous Robotic Piano Playing at Scale
Chen, Le, Zhao, Yi, Schneider, Jan, Gao, Quankai, Guist, Simon, Qian, Cheng, Kannala, Juho, Schölkopf, Bernhard, Pajarinen, Joni, Büchler, Dieter
This work has been submitted to the IEEE for possible publication.

Abstract: Endowing robot hands with human-level dexterity has been a long-standing goal in robotics. Bimanual robotic piano playing represents a particularly challenging task: it is high-dimensional, contact-rich, and requires fast, precise control. Our approach is built on three core components. First, we introduce an automatic fingering strategy based on Optimal Transport (OT), allowing the agent to autonomously discover efficient piano-playing strategies from scratch without demonstrations. Second, we conduct large-scale Reinforcement Learning (RL) by training more than 2,000 agents, each specialized in a distinct music piece, and aggregate their experience into a dataset named RP1M++, consisting of over one million trajectories for robotic piano playing. Extensive experiments and ablation studies highlight the effectiveness and scalability of our approach, advancing dexterous robotic piano playing at scale.

Achieving human-level dexterity remains one of the central challenges in robotics. The difficulty stems from the breadth of challenges, ranging from contact-rich manipulation to dynamic athletic tasks, each posing distinct demands. Manipulation tasks, such as grasping or reorienting objects [1], require the sustained application of appropriate forces at moderate speeds across objects with diverse shapes, materials, and weight distributions. Dynamic tasks, such as juggling [2] or table tennis [3], involve frequent contact changes, demand high precision, and allow little tolerance for error due to the rarity of contact opportunities. The combined requirement of precision and speed makes reproducing human-level dexterity particularly challenging.

Q. Gao is with the University of Southern California, CA 90007, United States (e-mail: quankaig@usc.edu). C. Qian is with Imperial College London, SW7 2AZ, London, United Kingdom (e-mail: c.qian24@imperial.ac.uk). J. Kannala is with the University of Oulu, 90570 Oulu, Finland. D. Büchler is also with the University of Alberta (Canada), the Alberta Machine Intelligence Institute (Amii), and holds a Canada CIFAR AI Chair.
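The OT-based fingering component above casts finger placement as an assignment problem: each key that must be pressed is matched to a distinct finger at minimal total travel cost. A toy one-dimensional sketch of that idea follows; the positions, the absolute-distance cost, and the brute-force solver are illustrative assumptions, not the paper's implementation, which solves a proper optimal transport problem:

```python
from itertools import permutations

def assign_fingers(finger_pos, key_pos):
    """Match each pressed key to a distinct finger, minimizing total
    1-D travel cost. Brute force is feasible here because one hand has
    only five fingers."""
    n = len(key_pos)
    best_cost, best = float("inf"), None
    for perm in permutations(range(len(finger_pos)), n):
        # perm[k] is the finger index assigned to key k
        cost = sum(abs(finger_pos[perm[k]] - key_pos[k]) for k in range(n))
        if cost < best_cost:
            best_cost, best = cost, perm
    return list(best), best_cost

fingers = [0.0, 2.0, 4.0, 6.0, 8.0]  # hypothetical fingertip x-positions
keys = [3.9, 6.1]                    # x-positions of keys to press
assignment, cost = assign_fingers(fingers, keys)
# fingers 2 and 4 are closest, so assignment == [2, 3]
```

With at most five fingers per hand the enumeration is cheap; a full-scale system would instead use an OT or Hungarian-algorithm solver with richer, task-aware costs.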
RP1M: A Large-Scale Motion Dataset for Piano Playing with Bi-Manual Dexterous Robot Hands
Zhao, Yi, Chen, Le, Schneider, Jan, Gao, Quankai, Kannala, Juho, Schölkopf, Bernhard, Pajarinen, Joni, Büchler, Dieter
It has been a long-standing research goal to endow robot hands with human-level dexterity. Bi-manual robot piano playing constitutes a task that combines challenges from dynamic tasks, such as generating fast yet precise motions, with slower but contact-rich manipulation problems. Although reinforcement learning-based approaches have shown promising results in single-task performance, these methods struggle in a multi-song setting. Our work aims to close this gap and thereby enable imitation learning approaches for robot piano playing at scale. To this end, we introduce the Robot Piano 1 Million (RP1M) dataset, containing more than one million trajectories of bi-manual robot piano-playing motion data. We formulate finger placements as an optimal transport problem, thus enabling automatic annotation of vast amounts of unlabeled songs. Benchmarking existing imitation learning approaches shows that such approaches reach state-of-the-art robot piano playing performance by leveraging RP1M.
From MIDI to Rich Tablatures: An Automatic Generative System Incorporating Lead Guitarists' Fingering and Stylistic Choices
Bontempi, Pierluigi, Manerba, Daniele, D'Hooge, Alexandre, Canazza, Sergio
Although the automatic identification of the optimal fingering for the performance of melodies on fretted string instruments has already been addressed (at least partially) in the literature, the specific case of the lead electric guitar requires a dedicated approach. We propose a system that can generate, from simple MIDI melodies, tablatures enriched with fingerings, articulations, and expressive techniques. The basic fingering is derived by solving a constrained, multi-attribute optimization problem that determines the best position of the fretting hand, not just the finger used at each moment. Then, articulations and expressive techniques are introduced by analyzing statistical data from the mySongBook corpus, the most common clichés, and biomechanical feasibility. Finally, the obtained output is converted into MusicXML format, which allows for easy visualization and use. The quality of the derived tablatures and the high configurability of the proposed approach can have several impacts, in particular in the fields of instrumental teaching, assisted composition and arranging, and computational expressive music performance models.
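The fretting-hand optimization described above can be illustrated with a Viterbi-style dynamic program that picks one hand position per note, trading off a per-note local cost against a hand-movement penalty. The cost terms, candidate format, and function name below are illustrative assumptions, not the paper's actual objective:

```python
def best_positions(candidates, move_cost=1.0):
    """Choose one fretting-hand position per note so that the summed
    local costs plus hand-movement costs are minimal (Viterbi-style DP).
    `candidates` holds, per note, a list of (position, local_cost) pairs."""
    # dp[i][j]: best total cost ending at candidate j of note i
    dp = [[c for _, c in candidates[0]]]
    back = []
    for i in range(1, len(candidates)):
        row, brow = [], []
        for pos, local in candidates[i]:
            costs = [dp[i - 1][j] + move_cost * abs(pos - candidates[i - 1][j][0])
                     for j in range(len(candidates[i - 1]))]
            j = min(range(len(costs)), key=costs.__getitem__)
            row.append(costs[j] + local)
            brow.append(j)
        dp.append(row)
        back.append(brow)
    # backtrack the cheapest path of candidate indices
    j = min(range(len(dp[-1])), key=dp[-1].__getitem__)
    path = [j]
    for brow in reversed(back):
        j = brow[j]
        path.append(j)
    path.reverse()
    return [candidates[i][j][0] for i, j in enumerate(path)]
```

A real system would add biomechanical constraints and per-string/fret attributes to the local cost, but the position-sequence search has this same shape.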
Checklist Models for Improved Output Fluency in Piano Fingering Prediction
Srivatsan, Nikita, Berg-Kirkpatrick, Taylor
In this work we present a new approach to the task of predicting fingerings for piano music. While prior neural approaches have often treated this as a sequence tagging problem with independent predictions, we put forward a checklist system, trained via reinforcement learning, that maintains a representation of recent predictions in addition to a hidden state, allowing it to learn soft constraints on output structure. We also demonstrate that by modifying input representations -- which in prior work using neural models have often taken the form of one-hot encodings over individual keys on the piano -- to instead encode the relative position on the keyboard with respect to the prior note, we can achieve much better performance. Additionally, we reassess the use of raw per-note labeling precision as an evaluation metric, noting that it does not adequately measure the fluency, i.e., human playability, of a model's output. To this end, we compare methods across several statistics that track the frequency of adjacent finger predictions that, while independently reasonable, would be physically challenging to perform in sequence, and we implement a reinforcement learning strategy to minimize these as part of our training loss. Finally, through human expert evaluation, we demonstrate significant gains in performability directly attributable to improvements with respect to these metrics.
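The relative input representation this abstract argues for can be sketched in a few lines: instead of a one-hot over 88 keys, each note is encoded as its signed keyboard offset from the previous note. The function name and the clipping interval are illustrative assumptions, not the paper's exact preprocessing:

```python
def relative_encoding(pitches, max_interval=24):
    """Encode each note as its signed offset (in semitones / keys) from
    the previous note, clipped to +/- max_interval. The first note gets
    offset 0, since it has no predecessor."""
    enc = [0]
    for prev, cur in zip(pitches, pitches[1:]):
        enc.append(max(-max_interval, min(max_interval, cur - prev)))
    return enc

# C4 E4 G4 C5 as MIDI note numbers -> intervals
offsets = relative_encoding([60, 64, 67, 72])  # [0, 4, 3, 5]
```

Such an encoding makes transposed passages look identical to the model, which is one plausible reason relative inputs outperform absolute one-hot keys.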