Zhao, Mingguo
HiFAR: Multi-Stage Curriculum Learning for High-Dynamics Humanoid Fall Recovery
Chen, Penghui, Wang, Yushi, Luo, Changsheng, Cai, Wenhan, Zhao, Mingguo
Humanoid robots encounter considerable difficulties in autonomously recovering from falls, especially within dynamic and unstructured environments. Conventional control methodologies are often inadequate in addressing the complexities associated with high-dimensional dynamics and the contact-rich nature of fall recovery. Meanwhile, reinforcement learning techniques are hindered by issues related to sparse rewards, intricate collision scenarios, and discrepancies between simulation and real-world applications. In this study, we introduce a multi-stage curriculum learning framework, termed HiFAR. This framework employs a staged learning approach that progressively incorporates increasingly complex and high-dimensional recovery tasks, thereby facilitating the robot's acquisition of efficient and stable fall recovery strategies. Furthermore, it enables the robot to adapt its policy to effectively manage real-world fall incidents. We assess the efficacy of the proposed method using a real humanoid robot, showcasing its capability to autonomously recover from a diverse range of falls with high success rates, rapid recovery times, robustness, and generalization.
Robust Quadrupedal Locomotion via Risk-Averse Policy Learning
Shi, Jiyuan, Bai, Chenjia, He, Haoran, Han, Lei, Wang, Dong, Zhao, Bin, Zhao, Mingguo, Li, Xiu, Li, Xuelong
The robustness of legged locomotion is crucial for quadrupedal robots in challenging terrains. Recently, Reinforcement Learning (RL) has shown promising results in legged locomotion, and various methods integrate privileged distillation, scene modeling, and external sensors to improve the generalization and robustness of locomotion policies. However, these methods struggle to handle uncertain scenarios such as abrupt terrain changes or unexpected external forces. In this paper, we consider a novel risk-sensitive perspective to enhance the robustness of legged locomotion. Specifically, we employ a distributional value function learned by quantile regression to model the aleatoric uncertainty of environments, and perform risk-averse policy learning by optimizing for worst-case scenarios via a risk distortion measure. Extensive experiments in both simulation environments and on a real Aliengo robot demonstrate that our method is efficient in handling various external disturbances, and the resulting policy exhibits improved robustness in harsh and uncertain situations in legged locomotion. Videos are available at https://risk-averse-locomotion.github.io/.
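As a rough illustration of the two ingredients named in the abstract above, the following sketch shows a quantile-regression (Huber) loss for a distributional value estimate and a risk-averse summary of the learned quantiles. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the `kappa` and `alpha` parameters, and the choice of CVaR as the distortion measure are all assumptions made here for illustration, since the abstract does not specify which distortion measure is used.

```python
import numpy as np


def quantile_huber_loss(pred_quantiles, target_samples, kappa=1.0):
    """Quantile-regression Huber loss (hypothetical helper).

    pred_quantiles: shape (N,), predicted return quantiles at midpoint
                    fractions tau_i = (i + 0.5) / N.
    target_samples: shape (M,), samples of the target return.
    """
    n = pred_quantiles.shape[0]
    taus = (np.arange(n) + 0.5) / n
    # Pairwise errors u[i, j] = target_j - pred_i via broadcasting.
    u = target_samples[None, :] - pred_quantiles[:, None]
    abs_u = np.abs(u)
    huber = np.where(abs_u <= kappa, 0.5 * u ** 2, kappa * (abs_u - 0.5 * kappa))
    # Asymmetric weight: errors on either side of a quantile are
    # penalized by tau or (1 - tau).
    weight = np.abs(taus[:, None] - (u < 0).astype(float))
    # Average over target samples, sum over quantile fractions.
    return float((weight * huber).mean(axis=1).sum())


def cvar_from_quantiles(quantiles, alpha=0.25):
    """CVaR distortion (an assumed choice): mean of the worst alpha
    fraction of the return quantiles."""
    q = np.sort(np.asarray(quantiles, dtype=float))
    k = max(1, int(np.ceil(alpha * len(q))))
    return float(q[:k].mean())
```

In a risk-averse actor-critic setup of this kind, the value used for the policy update would be `cvar_from_quantiles(...)` rather than the plain mean of the quantiles, so outcomes in the lower tail of the return distribution dominate the learning signal. With `alpha=1.0` the measure reduces to the ordinary mean, which recovers risk-neutral learning.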
Brain-inspired global-local hybrid learning towards human-like intelligence
Wu, Yujie, Zhao, Rong, Zhu, Jun, Chen, Feng, Xu, Mingkun, Li, Guoqi, Song, Sen, Deng, Lei, Wang, Guanrui, Zheng, Hao, Pei, Jing, Zhang, Youhui, Zhao, Mingguo, Shi, Luping
Two main routes of learning methods exist at present: neuroscience-inspired methods and machine learning methods. Both have their own advantages, but neither can currently solve all learning problems well. Integrating them into one network may provide better learning abilities for general tasks. In addition, spiking neural networks embody "computation" in the spatiotemporal domain, with unique features of rich coding schemes and threshold switching, which makes them well suited to low-power, highly parallel neuromorphic computing. Here, we report a spike-based general learning model that integrates the two learning routes by introducing a brain-inspired meta-local module and a two-phase parametric modelling. The hybrid model can meta-learn general local plasticity and receive top-down supervision information for multi-scale learning. We demonstrate that this hybrid model facilitates learning of many general tasks, including fault-tolerance learning, few-shot learning and multiple-task learning. Furthermore, the implementation of the hybrid model on the Tianjic neuromorphic platform proves that it can fully utilize the advantages of neuromorphic hardware architecture and promote energy-efficient on-chip applications.