Collaborating Authors

Khazoom, Charles


Tailoring Solution Accuracy for Fast Whole-body Model Predictive Control of Legged Robots

arXiv.org Artificial Intelligence

Thanks to recent advancements in accelerating non-linear model predictive control (NMPC), it is now feasible to deploy whole-body NMPC at real-time rates for humanoid robots. However, enforcing inequality constraints in real time for such high-dimensional systems remains challenging due to the additional solver iterations they require. This paper presents an implementation of whole-body NMPC for legged robots that provides low-accuracy solutions to problems with general equality and inequality constraints. Instead of aiming for highly accurate optimal solutions, we leverage the alternating direction method of multipliers (ADMM) to rapidly produce low-accuracy solutions to quadratic programming subproblems. Our extensive simulation results indicate that real robots often cannot benefit from highly accurate solutions due to dynamics discretization errors, inertial modeling errors, and delays. We incorporate control barrier functions (CBFs) at the initial timestep of the NMPC for the self-collision constraints, resulting in up to a 26-fold reduction in the number of self-collisions without adding computational burden. The controller is reliably deployed on hardware at 90 Hz for a problem involving 32 timesteps, 2004 variables, and 3768 constraints. The NMPC delivers sufficiently accurate solutions, enabling the MIT Humanoid to plan complex crossed-leg and arm motions that enhance stability when walking and recovering from significant disturbances.
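
To make the ADMM step concrete, the sketch below runs a few iterations of an OSQP-style splitting on a generic QP, min 0.5 x'Px + q'x subject to l <= Ax <= u, warm-started as it would be between MPC solves. The function name, warm-start interface, and parameter values are illustrative assumptions, not the paper's implementation; the point is that each iteration is a single cached linear solve plus a projection, so stopping after a handful of iterations yields exactly the kind of low-accuracy solution described above.

```python
import numpy as np

def admm_qp(P, q, A, l, u, x, z, y, rho=0.1, sigma=1e-6, iters=10):
    """Run a few ADMM iterations (OSQP-style splitting) on
        min 0.5 x'Px + q'x   s.t.   l <= Ax <= u,
    warm-started from (x, z, y). Stopping early returns a
    low-accuracy primal-dual solution."""
    n = P.shape[0]
    # Factor the regularized normal equations once; in an MPC loop this
    # factorization is cached and reused across iterations.
    L = np.linalg.cholesky(P + sigma * np.eye(n) + rho * A.T @ A)
    for _ in range(iters):
        rhs = sigma * x - q + A.T @ (rho * z - y)
        x = np.linalg.solve(L.T, np.linalg.solve(L, rhs))  # cached solve
        z_tilde = A @ x
        z = np.clip(z_tilde + y / rho, l, u)  # project onto the box [l, u]
        y = y + rho * (z_tilde - z)           # dual update
    return x, z, y
```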


Benchmarking Potential Based Rewards for Learning Humanoid Locomotion

arXiv.org Artificial Intelligence

The main challenge in developing effective reinforcement learning (RL) pipelines is often the design and tuning of the reward functions. A well-designed shaping reward can lead to significantly faster learning. Naively formulated rewards, however, can conflict with the desired behavior and result in overfitting or even erratic performance if not properly tuned. In theory, the broad class of potential-based reward shaping (PBRS) can help guide the learning process without affecting the optimal policy. Although several studies have explored the use of PBRS to accelerate learning convergence, most have been limited to grid worlds and low-dimensional systems, and RL in robotics has predominantly relied on standard forms of reward shaping. In this paper, we benchmark standard forms of shaping against PBRS for a humanoid robot. We find that in this high-dimensional system, PBRS offers only marginal benefits in convergence speed. However, the PBRS reward terms are significantly more robust to scaling than typical reward-shaping terms, and are thus easier to tune.
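
For reference, the PBRS term has the closed form F(s, s') = gamma * Phi(s') - Phi(s), which by Ng et al. (1999) leaves the optimal policy unchanged. Below is a minimal sketch of such a term; the potential function and the `pose_error` field are hypothetical stand-ins for illustration, not the rewards benchmarked in the paper.

```python
import numpy as np

GAMMA = 0.99  # must match the discount factor used by the RL algorithm

def potential(state):
    # Hypothetical potential: closer to a nominal standing pose is better.
    return -np.linalg.norm(state["pose_error"])

def pbrs_term(state, next_state, done):
    """Shaping term F(s, s') = GAMMA * Phi(s') - Phi(s), added on top
    of the task reward. The potential of terminal states is set to
    zero so shaping cannot accumulate across episode boundaries."""
    phi_next = 0.0 if done else potential(next_state)
    return GAMMA * phi_next - potential(state)
```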


Humanoid Self-Collision Avoidance Using Whole-Body Control with Control Barrier Functions

arXiv.org Artificial Intelligence

This work combines control barrier functions (CBFs) with a whole-body controller to enable self-collision avoidance for the MIT Humanoid. Existing reactive controllers for self-collision avoidance cannot guarantee collision-free trajectories because they do not leverage the robot's full dynamics, which compromises kinematic feasibility. In contrast, the proposed CBF-WBC controller can reason about the robot's underactuated dynamics in real time to guarantee collision-free motions. The effectiveness of this approach is validated in simulation. First, a simple hand-reaching experiment shows that the CBF-WBC enables the robot's hand to deviate from an infeasible reference trajectory to avoid self-collisions. Second, the CBF-WBC is combined with a linear model predictive controller (LMPC) designed for dynamic locomotion, and the CBF-WBC is used to track the LMPC predictions. Walking experiments show that adding CBFs avoids leg self-collisions when the footstep location or swing trajectory provided by the high-level planner is infeasible for the real robot, and generates feasible arm motions that improve disturbance recovery.
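
As a rough illustration of how a CBF enters a whole-body controller, the sketch below filters a desired joint acceleration through a small QP that enforces an exponential CBF condition on a signed-distance function h(q) between two collision bodies. The function name, gains, and reduced problem structure are assumptions for illustration; the paper's WBC additionally handles the floating-base dynamics, contact constraints, and torque limits.

```python
import numpy as np
import cvxpy as cp

def cbf_filter(qdd_des, h, Jh, dJh_qd, qd, k0=100.0, k1=20.0):
    """Filter a desired joint acceleration through a CBF constraint.

    h      : signed distance between two collision geometries (scalar)
    Jh     : gradient of h w.r.t. joint positions, shape (n,)
    dJh_qd : time derivative of Jh multiplied by qd (scalar)

    Enforcing hddot + k1*hdot + k0*h >= 0 (with k0, k1 chosen so the
    characteristic roots are stable) keeps the safe set {h >= 0}
    forward invariant, i.e. collision-free.
    """
    qdd = cp.Variable(qdd_des.shape[0])
    hd = Jh @ qd                 # hdot  = Jh * qdot
    hdd = Jh @ qdd + dJh_qd      # hddot = Jh * qddot + Jhdot * qdot
    prob = cp.Problem(cp.Minimize(cp.sum_squares(qdd - qdd_des)),
                      [hdd + k1 * hd + k0 * h >= 0])
    prob.solve()
    return qdd.value
```

When the constraint is inactive the filter returns qdd_des unchanged; near a self-collision it deviates from the reference just enough to satisfy the barrier condition, which is the behavior the hand-reaching experiment demonstrates.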