

Neural Dynamic Policies for End-to-End Sensorimotor Learning

Neural Information Processing Systems

The current dominant paradigm in sensorimotor control, whether imitation or reinforcement learning, is to train policies directly in raw action spaces such as torque, joint angle, or end-effector position. This forces the agent to make decisions at each point in training, and hence limits scalability to continuous, high-dimensional, and long-horizon tasks. In contrast, research in classical robotics has, for a long time, exploited dynamical systems as a policy representation to learn robot behaviors via demonstrations.
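The dynamical-systems policy representation referenced above is most commonly instantiated as a Dynamic Movement Primitive (DMP). As a rough one-dimensional sketch, not the paper's implementation, with illustrative gains, a heuristic basis-width choice, and simple Euler integration all assumed here, a discrete DMP rollout looks like:

```python
import numpy as np

def rollout_dmp(y0, g, weights, tau=1.0, dt=0.01,
                alpha=25.0, beta=6.25, alpha_x=1.0):
    """Roll out a 1-D discrete DMP from start y0 to goal g.

    `weights` parameterize the learned forcing term; with all-zero
    weights the system simply converges smoothly to the goal.
    """
    n_basis = len(weights)
    centers = np.exp(-alpha_x * np.linspace(0, 1, n_basis))  # basis centers in phase space
    widths = n_basis ** 1.5 / centers                        # heuristic width choice (assumption)
    y, z, x = y0, 0.0, 1.0
    traj = [y]
    for _ in range(int(tau / dt)):
        psi = np.exp(-widths * (x - centers) ** 2)
        # Forcing term; the x factor makes it vanish as the phase decays to 0.
        f = x * (g - y0) * (psi @ weights) / (psi.sum() + 1e-10)
        z += dt / tau * (alpha * (beta * (g - y) - z) + f)   # transformation system
        y += dt / tau * z
        x += dt / tau * (-alpha_x * x)                       # canonical system
        traj.append(y)
    return np.array(traj)
```

With zero forcing weights the rollout reduces to a critically damped attractor toward the goal, which is the "set the forcing term to 0" baseline discussed in the author feedback entry below.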


354ac345fd8c6d7ef634d9a8e3d47b83-AuthorFeedback.pdf

Neural Information Processing Systems

"Is it useful to also learn alpha and beta in eqn 4?", "Does it suffice to just learn g and set the forcing term to 0?": We have finished all three sets of experiments. Unlike our end-to-end architecture, most prior works either use a single DMP to represent the whole trajectory, or manually segment the trajectory to learn different DMPs.



Human-robot collaborative transport personalization via Dynamic Movement Primitives and velocity scaling

Franceschi, Paolo, Bussolan, Andrea, Pomponi, Vincenzo, Avram, Oliver, Baraldo, Stefano, Valente, Anna

arXiv.org Artificial Intelligence

Nowadays, industries are showing a growing interest in human-robot collaboration, particularly for shared tasks. This requires intelligent strategies to plan a robot's motions, considering both task constraints and human-specific factors such as height and movement preferences. This work introduces a novel approach to generate personalized trajectories using Dynamic Movement Primitives (DMPs), enhanced with real-time velocity scaling based on human feedback. The method was rigorously tested in industrial-grade experiments, focusing on the collaborative transport of an engine cowl lip section. Comparative analysis between DMP-generated trajectories and a state-of-the-art motion planner (BiTRRT) highlights their adaptability combined with velocity scaling. Subjective user feedback further demonstrates a clear preference for DMP-based interactions. Objective evaluations, including physiological measurements from brain and skin activity, reinforce these findings, showcasing the advantages of DMPs in enhancing human-robot interaction and improving user experience.
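A convenient property of DMPs for this kind of personalization is that playback speed can be modulated at runtime through the time constant tau without re-learning the shape of the motion. As a minimal sketch, the linear mapping from a feedback score to tau below is an illustrative assumption, not the controller from the paper:

```python
def scaled_tau(tau_nominal, comfort, slow_scale=2.0, fast_scale=0.5):
    """Map a human comfort score in [0, 1] to a DMP time constant.

    comfort = 0 (uncomfortable) slows the motion (tau doubled);
    comfort = 1 (comfortable) speeds it up (tau halved).
    The linear law and the score itself are hypothetical.
    """
    scale = slow_scale - comfort * (slow_scale - fast_scale)
    return tau_nominal * scale
```

Because tau only rescales time in the canonical and transformation systems, the spatial path of the DMP is preserved while the velocity profile adapts to the human.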






TReF-6: Inferring Task-Relevant Frames from a Single Demonstration for One-Shot Skill Generalization

Ding, Yuxuan, Wang, Shuangge, Fitzgerald, Tesca

arXiv.org Artificial Intelligence

Robots often struggle to generalize from a single demonstration due to the lack of a transferable and interpretable spatial representation. In this work, we introduce TReF-6, a method that infers a simplified, abstracted 6DoF Task-Relevant Frame from a single trajectory. Our approach identifies an influence point purely from the trajectory geometry to define the origin for a local frame, which serves as a reference for parameterizing a Dynamic Movement Primitive (DMP). This influence point captures the task's spatial structure, extending the standard DMP formulation beyond start-goal imitation. The inferred frame is semantically grounded via a vision-language model and localized in novel scenes by Grounded-SAM, enabling functionally consistent skill generalization. We validate TReF-6 in simulation and demonstrate robustness to trajectory noise. We further deploy an end-to-end pipeline on real-world manipulation tasks, showing that TReF-6 supports one-shot imitation learning that preserves task intent across diverse object configurations.
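One way to identify an influence point "purely from the trajectory geometry," as the abstract describes, is to look for the point of maximum curvature along the demonstration. The criterion below is only an illustration of that idea, not TReF-6's actual method:

```python
import numpy as np

def influence_point(traj):
    """Return the point of maximum discrete curvature on a 2-D trajectory.

    `traj` has shape (n, 2). Curvature is |v x a| / |v|^3 with the 2-D
    cross product computed componentwise. A geometry-derived point like
    this can serve as the origin of a local task frame.
    """
    v = np.gradient(traj, axis=0)                    # finite-difference velocities
    a = np.gradient(v, axis=0)                       # finite-difference accelerations
    cross = v[:, 0] * a[:, 1] - v[:, 1] * a[:, 0]    # scalar 2-D cross product
    speed = np.linalg.norm(v, axis=1)
    curvature = np.abs(cross) / (speed ** 3 + 1e-10)
    return traj[np.argmax(curvature)]
```

On an L-shaped demonstration, for instance, the curvature peaks at the corner, which is a plausible task-relevant origin (e.g., the location of an obstacle or contact).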


KeyMPs: One-Shot Vision-Language Guided Motion Generation by Sequencing DMPs for Occlusion-Rich Tasks

Anarossi, Edgar, Kwon, Yuhwan, Tahara, Hirotaka, Tanaka, Shohei, Shirai, Keisuke, Hamaya, Masashi, Beltran-Hernandez, Cristian C., Hashimoto, Atsushi, Matsubara, Takamitsu

arXiv.org Artificial Intelligence

Dynamic Movement Primitives (DMPs) provide a flexible framework wherein smooth robotic motions are encoded into modular parameters. However, they face challenges in integrating multimodal inputs commonly used in robotics, like vision and language, into their framework. To fully maximize DMPs' potential, enabling them to handle multimodal inputs is essential. In addition, we aim to extend DMPs' capability to handle object-focused tasks requiring one-shot complex motion generation, as observation occlusion can easily happen mid-execution in such tasks (e.g., knife occlusion in cake icing, hand occlusion in dough kneading, etc.). A promising approach is to leverage Vision-Language Models (VLMs), which process multimodal data and can grasp high-level concepts. However, they typically lack enough knowledge and capability to directly infer low-level motion details and instead only serve as a bridge between high-level instructions and low-level control. To address this limitation, we propose Keyword Labeled Primitive Selection and Keypoint Pairs Generation Guided Movement Primitives (KeyMPs), a framework that combines VLMs with sequencing of DMPs. KeyMPs use VLMs' high-level reasoning capability to select a reference primitive through keyword labeled primitive selection, and VLMs' spatial awareness to generate spatial scaling parameters used for sequencing DMPs by generalizing the overall motion through keypoint pairs generation; together these enable one-shot vision-language guided motion generation that aligns with the intent expressed in the multimodal input. We validate our approach through experiments on two occlusion-rich tasks: object cutting, conducted in both simulated and real-world environments, and cake icing, performed in simulation. These evaluations demonstrate superior performance over other DMP-based methods that integrate VLM support.
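The core sequencing step, spatially rescaling one reference primitive across a list of keypoint pairs, can be sketched in a few lines. The names and the simple affine start-goal rescaling below are illustrative assumptions, not the KeyMPs implementation:

```python
import numpy as np

def sequence_primitive(template, keypoint_pairs):
    """Chain one normalized reference primitive across keypoint pairs.

    `template` is a 1-D trajectory normalized to run from 0 to 1
    (e.g., a DMP rollout); each (start, goal) pair produces one
    affinely rescaled segment, and the segments are concatenated.
    """
    segments = [start + template * (goal - start)    # rescale shape to each span
                for start, goal in keypoint_pairs]
    return np.concatenate(segments)
```

In the full framework, a VLM would supply both which labeled primitive to use as the template and the keypoint pairs themselves; here they are simply given as inputs.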