Collaborative Loco-Manipulation for Pick-and-Place Tasks with Dynamic Reward Curriculum

An, Tianxu, De Vincenti, Flavio, Ma, Yuntao, Hutter, Marco, Coros, Stelian

arXiv.org Artificial Intelligence 

Abstract: We present a hierarchical reinforcement learning (RL) pipeline for training one-armed legged robots to perform pick-and-place (P&P) tasks end-to-end, from approaching the payload to releasing it at a target area, in both single-robot and cooperative dual-robot settings. We introduce a novel dynamic reward curriculum that enables a single policy to efficiently learn long-horizon P&P operations by progressively guiding the agents through payload-centered sub-objectives. Compared to state-of-the-art approaches for long-horizon RL tasks, our method improves training efficiency by 55% and reduces execution time by 18.6% in simulation experiments. In the dual-robot case, we show that our policy enables each robot to attend to different components of its observation space at distinct task stages, promoting effective coordination via autonomous attention shifts. To our knowledge, this is the first RL pipeline that tackles the full scope of collaborative P&P with two legged manipulators.

Many industries rely on human workers to perform physically demanding tasks such as lifting and transporting heavy loads through cluttered environments. Logistics and construction are among the most prominent examples, where workers engage in monotonous, repetitive pick-and-place (P&P) operations with significant risks of physical injury. While advancements in hardware technology have made it possible for robots to traverse complex environments [19, 25] and dynamically manipulate rigid objects [4], the necessary levels of autonomy and coordination for long-horizon mobile manipulation have yet to be effectively demonstrated in real-world scenarios.
