Integrating Diffusion-based Multi-task Learning with Online Reinforcement Learning for Robust Quadruped Robot Control
Qin, Xinyao, Ma, Xiaoteng, Qi, Yang, Liu, Qihan, Xue, Chuanyi, Gui, Ning, Dong, Qinyu, Yang, Jun, Liang, Bin
arXiv.org Artificial Intelligence
Abstract: Recent research has highlighted the powerful capabilities of imitation learning in robotics. Leveraging generative models, particularly diffusion models, these approaches offer notable advantages such as strong multi-task generalization, effective language conditioning, and high sample efficiency. While they have been applied successfully to manipulation tasks, their use in legged locomotion remains relatively underexplored, mainly because compounding errors undermine stability and because task transitions are difficult to learn from limited data. Online reinforcement learning (RL) has demonstrated promising results in legged robot control in recent years and can provide valuable insights for addressing these challenges. In this work, we propose DMLoco, a diffusion-based framework for quadruped robots that integrates multi-task pretraining with online PPO finetuning to enable language-conditioned control and robust task transitions. After pretraining, the policy is finetuned in simulation to ensure robustness and stable task transitions for real-world deployment. By utilizing Denoising Diffusion Implicit Models (DDIM) for efficient sampling and TensorRT for optimized deployment, our policy runs onboard at 50 Hz, offering a scalable and efficient solution for adaptive, language-guided locomotion on resource-constrained robotic platforms.
Sep-15-2025