Collaborating Authors: Ding, Jiatao


Explosive Jumping with Rigid and Articulated Soft Quadrupeds via Example Guided Reinforcement Learning

arXiv.org Artificial Intelligence

Achieving controlled jumping behaviour for a quadruped robot is a challenging task, especially when passive compliance is introduced into the mechanical design. This study addresses the challenge via imitation-based deep reinforcement learning with a progressive training process. First, we learn the jumping skill by mimicking a coarse jumping example generated by model-based trajectory optimization. We then generalize the learned policy to broader situations, including various distances in both forward and lateral directions, and finally pursue robust jumping on ground of unknown unevenness. In addition, with little reward tuning, we learn the jumping policy for a quadruped with parallel elasticity. Results show that with the proposed method, i) the robot learns versatile jumps from a single demonstration, ii) compared to the rigid robot without parallel elasticity, the robot with parallel compliance reduces the landing error by 11.1%, the energy cost by 15.2%, and the peak torque by 15.8%, and iii) the robot performs jumps of variable distances robustly against ground unevenness (height perturbations of up to 4 cm) using only proprioceptive perception.
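
To make the progressive scheme concrete, below is a minimal Python sketch of one way an imitation term (tracking the single optimized jump example) could be blended with a task term and annealed over training stages. The reward forms, weights, and names (imitation_reward, w_imit, and so on) are illustrative assumptions, not details taken from the paper.

import numpy as np

def imitation_reward(q, q_ref, sigma=0.5):
    # Exponentiated tracking error between the robot's joint angles q and
    # the reference joints q_ref from the optimized jump example.
    return np.exp(-np.sum((q - q_ref) ** 2) / sigma ** 2)

def task_reward(landing_xy, target_xy):
    # Penalize the distance between actual and commanded landing locations.
    return -np.linalg.norm(np.asarray(landing_xy) - np.asarray(target_xy))

def total_reward(q, q_ref, landing_xy, target_xy, stage):
    # Progressive training: early stages lean on the single demonstration;
    # later stages shift weight toward the task objective so the policy can
    # generalize to new jump distances and uneven ground.
    w_imit = {0: 1.0, 1: 0.5, 2: 0.1}[stage]
    return w_imit * imitation_reward(q, q_ref) \
        + (1.0 - w_imit) * task_reward(landing_xy, target_xy)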


Curriculum-Based Reinforcement Learning for Quadrupedal Jumping: A Reference-free Design

arXiv.org Artificial Intelligence

Deep reinforcement learning (DRL) has emerged as a promising solution for mastering explosive and versatile quadrupedal jumping skills. However, current DRL-based frameworks usually rely on well-defined reference trajectories, obtained by capturing animal motions or transferring experience from existing controllers. This work explores the possibility of learning dynamic jumping without imitating a reference trajectory. To this end, we incorporate a curriculum design into DRL so as to accomplish challenging tasks progressively. Starting from a vertical in-place jump, we generalize the learned policy to forward and diagonal jumps and, finally, learn to jump across obstacles. Conditioned on the desired landing location, orientation, and obstacle dimensions, the proposed approach yields a wide range of jumping motions, including omnidirectional and robust jumping, removing the need to extract reference motions in advance. In particular, freed from the constraints of a reference motion, a 90 cm forward jump is achieved, exceeding the records previously reported for similar robots in the literature. Additionally, continuous jumping on a soft grassy floor is accomplished, even though it was never encountered during training. A supplementary video showing our results can be found at https://youtu.be/nRaMCrwU5X8.
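
The staged generalization reads naturally as a command curriculum. The Python sketch below, with entirely hypothetical stage names, ranges, and thresholds, shows one plausible way to sample landing commands per stage and to advance once the policy succeeds reliably enough.

import numpy as np

# Hypothetical curriculum mirroring the abstract's progression:
# in-place jump -> forward jumps -> diagonal jumps -> jumps across obstacles.
CURRICULUM = [
    {"name": "in_place",  "dist_range": (0.0, 0.0), "obstacle_h": 0.0},
    {"name": "forward",   "dist_range": (0.1, 0.9), "obstacle_h": 0.0},
    {"name": "diagonal",  "dist_range": (0.1, 0.6), "obstacle_h": 0.0},
    {"name": "obstacles", "dist_range": (0.2, 0.9), "obstacle_h": 0.15},
]

def sample_command(stage_idx, rng):
    # Sample a jump command (the policy's conditioning input) for the stage.
    stage = CURRICULUM[stage_idx]
    return {
        "distance": rng.uniform(*stage["dist_range"]),  # landing distance [m]
        "yaw": rng.uniform(-np.pi / 4, np.pi / 4),      # landing orientation
        "obstacle_h": stage["obstacle_h"],              # obstacle height [m]
    }

def maybe_advance(stage_idx, success_rate, threshold=0.8):
    # Move to the next stage once the policy is reliable on the current one.
    if success_rate > threshold and stage_idx < len(CURRICULUM) - 1:
        return stage_idx + 1
    return stage_idx

cmd = sample_command(0, np.random.default_rng(0))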


Two-Stage Learning of Highly Dynamic Motions with Rigid and Articulated Soft Quadrupeds

arXiv.org Artificial Intelligence

Controlled execution of dynamic motions in quadrupedal robots, especially those with articulated soft bodies, presents a unique set of challenges that traditional methods struggle to address efficiently. In this study, we tackle these issues with a simple yet effective two-stage learning framework for generating dynamic motions in quadrupedal robots. First, a gradient-free evolution strategy is employed to discover simply represented control policies, eliminating the need for a predefined reference motion. These policies are then refined using deep reinforcement learning. Our approach enables the acquisition of complex motions such as pronking and back-flipping, effectively from scratch. It also simplifies the traditionally labour-intensive task of reward shaping, boosting the efficiency of the learning process. Importantly, the framework proves particularly effective for articulated soft quadrupeds, whose inherent compliance and adaptability make them well suited for dynamic tasks but also introduce unique control challenges.
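
As a rough illustration of the first stage, the sketch below implements a basic OpenAI-style evolution strategy over a low-dimensional policy vector. The rollout function is a stand-in for the actual simulator, and every hyperparameter is a placeholder rather than a setting from the paper.

import numpy as np

def rollout_return(theta):
    # Stand-in for a simulator rollout that returns the episodic reward of a
    # policy parameterized by theta (e.g., a simply represented open-loop
    # motion). A toy quadratic objective is used here for illustration.
    return -np.sum((theta - 1.0) ** 2)

def evolution_strategy(theta0, iters=200, pop=32, sigma=0.1, lr=0.05):
    # Stage 1: gradient-free search, so no reference motion and no
    # differentiable model are required to discover the behaviour.
    theta = theta0.copy()
    for _ in range(iters):
        eps = np.random.randn(pop, theta.size)
        returns = np.array([rollout_return(theta + sigma * e) for e in eps])
        advantage = (returns - returns.mean()) / (returns.std() + 1e-8)
        theta += lr / (pop * sigma) * eps.T @ advantage
    return theta

# Stage 2 (not shown): the discovered motion then serves as the starting
# point for refinement with a deep RL algorithm such as PPO.
theta_star = evolution_strategy(np.zeros(8))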


Nonlinear Model Predictive Control for Robust Bipedal Locomotion: Exploring Angular Momentum and CoM Height Changes

arXiv.org Artificial Intelligence

Human beings can utilize multiple balance strategies, e.g., the ankle, hip, and stepping strategies, to maintain balance. In this work, we propose a novel Nonlinear Model Predictive Control (NMPC) framework for robust locomotion, with the capabilities of step location adjustment, Center of Mass (CoM) height variation, and angular momentum adaptation. These features are realized by constraining the Zero Moment Point (ZMP) within the support polygon. By using the nonlinear inverted pendulum plus flywheel model, the effects of upper-body rotation and vertical height motion are taken into account. As a result, the NMPC is formulated as a quadratically constrained quadratic program (QCQP), which is solved quickly by sequential quadratic programming. Using this unified framework, robust walking patterns that exploit reactive stepping, body inclination, and CoM height variation are generated based on state estimation. The adaptability of the approach to bipedal walking in multiple scenarios has been demonstrated through simulation studies.

Humanoid robots have attracted much attention for their capability to accomplish challenging tasks in real-world environments. Over the past several decades, state-of-the-art platforms such as ASIMO [1], Atlas [2], WALK-MAN [3], and CogIMon [4] have been developed for this purpose. However, owing to the complex nonlinear dynamics of bipedal locomotion, enhancing walking stability, a prerequisite for making humanoids practical, still requires further study. In this paper, inspired by the fact that human beings can exploit redundant Degrees of Freedom (DoF) and adopt various strategies, such as the ankle, hip, and stepping strategies, to recover balance [5]-[7], we aim to develop a versatile and robust walking pattern generator that integrates multiple balance strategies in a unified way.

To generate walking patterns in a time-efficient manner, simplified dynamic models have been proposed, among which the Linear Inverted Pendulum Model (LIPM) is the most widely used [8]. Using the LIPM, Kajita et al. proposed preview control for ZMP tracking [9]. By adopting a Linear Quadratic Regulator (LQR) scheme, the ankle torque was adjusted to modulate the ZMP and CoM trajectories. Nevertheless, this strategy can neither modulate the step parameters nor take into account the feasibility constraints arising from actuation limits and the environment. To overcome this drawback, Wieber et al. proposed a Model Predictive Control (MPC) algorithm that exploits the ankle strategy [10] and later extended it to adjust the step location [11].
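
To see why constraining the ZMP yields a quadratically constrained program, consider the following reconstruction, standard for inverted-pendulum-plus-flywheel models and not copied from the paper. With CoM position $(x, z)$, ground height $p_z$, mass $m$, gravity $g$, and angular momentum rate $\dot{L}_y$ from upper-body rotation, the sagittal ZMP and its support-polygon bounds are

\[
  p_x = x - \frac{(z - p_z)\,\ddot{x}}{\ddot{z} + g}
          - \frac{\dot{L}_y}{m\,(\ddot{z} + g)},
  \qquad
  p_x^{\min} \le p_x \le p_x^{\max}.
\]

Multiplying the bounds through by $\ddot{z} + g > 0$ leaves products of decision variables such as $(z - p_z)\,\ddot{x}$, so the support-polygon condition is quadratic rather than linear, and the NMPC becomes a QCQP instead of the plain QP obtained with the LIPM.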