
Collaborating Authors

 He, Zhengmao


Learning Visual Quadrupedal Loco-Manipulation from Demonstrations

arXiv.org Artificial Intelligence

Quadruped robots are progressively being integrated into human environments. Despite the growing locomotion capabilities of quadrupedal robots, their interaction with objects in realistic scenes is still limited. While additional robotic arms on quadrupedal robots enable manipulating objects, they are sometimes redundant given that a quadruped robot is essentially a mobile unit equipped with four limbs, each possessing 3 degrees of freedom (DoFs). Hence, we aim to empower a quadruped robot to execute real-world manipulation tasks using only its legs. We decompose the loco-manipulation process into a low-level reinforcement learning (RL)-based controller and a high-level Behavior Cloning (BC)-based planner. By parameterizing the manipulation trajectory, we synchronize the efforts of the upper and lower layers, thereby leveraging the advantages of both RL and BC. Our approach is validated through simulations and real-world experiments, demonstrating the robot's ability to perform tasks that demand mobility and high precision, such as lifting a basket from the ground while moving, closing a dishwasher, pressing a button, and pushing a door. Project website: https://zhengmaohe.github.io/leg-manip
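
The abstract describes a two-level design: a high-level Behavior Cloning planner that outputs a parameterized manipulation trajectory, and a low-level RL controller that tracks it with the legs. Below is a minimal sketch of that decomposition; all dimensions, class names, and interfaces (TRAJ_PARAM_DIM, BCPlanner, RLController, the once-per-episode replanning) are illustrative assumptions, not the paper's actual code.

```python
import numpy as np

# Illustrative sizes only; the paper does not specify these.
TRAJ_PARAM_DIM = 8   # assumed size of the parameterized manipulation trajectory
PROPRIO_DIM = 48     # assumed proprioceptive state size (joint pos/vel, base state)
NUM_JOINTS = 12      # 4 legs x 3 DoFs each

class BCPlanner:
    """High-level planner trained with behavior cloning on demonstrations."""
    def __init__(self, policy_fn):
        self.policy_fn = policy_fn  # e.g. a trained vision network

    def plan(self, image):
        # Predict trajectory parameters (e.g. waypoints) from a camera image.
        return self.policy_fn(image)  # shape: (TRAJ_PARAM_DIM,)

class RLController:
    """Low-level controller trained with RL to track the commanded trajectory."""
    def __init__(self, policy_fn):
        self.policy_fn = policy_fn

    def act(self, proprio_state, traj_params, phase):
        # Condition joint targets on proprioception, trajectory parameters, and phase.
        obs = np.concatenate([proprio_state, traj_params, [phase]])
        return self.policy_fn(obs)  # shape: (NUM_JOINTS,)

def rollout(planner, controller, get_image, get_state, apply_action, horizon=200):
    traj_params = planner.plan(get_image())  # replan once per episode (assumption)
    for t in range(horizon):
        phase = t / horizon                  # normalized progress along the trajectory
        apply_action(controller.act(get_state(), traj_params, phase))

if __name__ == "__main__":
    # Dummy stand-ins so the sketch runs end to end.
    rng = np.random.default_rng(0)
    planner = BCPlanner(lambda img: rng.standard_normal(TRAJ_PARAM_DIM))
    controller = RLController(lambda obs: np.tanh(obs[:NUM_JOINTS]))
    rollout(planner, controller,
            get_image=lambda: rng.standard_normal((64, 64, 3)),
            get_state=lambda: rng.standard_normal(PROPRIO_DIM),
            apply_action=lambda a: None,
            horizon=10)
```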


Uni-O4: Unifying Online and Offline Deep Reinforcement Learning with Multi-Step On-Policy Optimization

arXiv.org Artificial Intelligence

Combining offline and online reinforcement learning (RL) is crucial for efficient and safe learning. However, previous approaches treat offline and online learning as separate procedures, resulting in redundant designs and limited performance. We ask: Can we achieve straightforward yet effective offline and online learning without introducing extra conservatism or regularization? In this study, we propose Uni-O4, which utilizes an on-policy objective for both offline and online learning. Owing to the alignment of objectives in the two phases, the RL agent can transfer between offline and online learning seamlessly. This property enhances the flexibility of the learning paradigm, allowing for arbitrary combinations of pretraining, fine-tuning, offline, and online learning. In the offline phase, specifically, Uni-O4 leverages diverse ensemble policies to address the mismatch between the estimated behavior policy and the offline dataset. Through a simple offline policy evaluation (OPE) approach, Uni-O4 can achieve multi-step policy improvement safely. We demonstrate that by employing the method above, the fusion of these two paradigms can yield superior offline initialization as well as stable and rapid online fine-tuning capabilities. Through real-world robot tasks, we highlight the benefits of this paradigm for rapid deployment in challenging, previously unseen real-world environments. Additionally, through comprehensive evaluations on numerous simulated benchmarks, we substantiate that our method achieves state-of-the-art performance in both offline learning and offline-to-online fine-tuning.

Imagine a scenario in which a reinforcement learning robot needs to function and improve itself in the real world: its policy might go through the pipeline of training online in a simulator, then offline with real-world data, and lastly online in the real world. However, current reinforcement learning algorithms usually focus on specific stages of learning, which complicates the effort to train robots within a single unified framework. Online RL algorithms require a substantial amount of interaction and exploration to attain strong performance, which is prohibitive in many real-world applications. Offline RL, in which agents learn from a fixed dataset generated by other behavior policies, is a potential solution.
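
The abstract's key idea is a single on-policy objective shared by the offline and online phases. A PPO-style clipped surrogate is a common instance of such an objective; the sketch below shows that loss only, with Uni-O4's exact objective, ensemble construction, and hyperparameters treated as unspecified assumptions.

```python
import torch

def clipped_surrogate_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    # Standard PPO-style clipped surrogate loss (a sketch, not Uni-O4's exact form).
    ratio = torch.exp(logp_new - logp_old)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()

# Illustration of where logp_old would come from in each phase (assumption):
# online, it is the previous policy iterate evaluated on fresh rollouts;
# offline, it is an (ensemble-)estimated behavior policy evaluated on dataset
# transitions, which is what lets the same objective serve both phases.
if __name__ == "__main__":
    logp_old = torch.randn(256)
    logp_new = logp_old + 0.05 * torch.randn(256)
    advantages = torch.randn(256)
    print(clipped_surrogate_loss(logp_new, logp_old, advantages))
```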


ArrayBot: Reinforcement Learning for Generalizable Distributed Manipulation through Touch

arXiv.org Artificial Intelligence

The notion of robotic manipulation [1, 2] easily invokes the image of a biomimetic robot arm or hand trying to grasp tabletop objects and then rearrange them into desired configurations inferred by exteroceptive sensors such as RGBD cameras. To facilitate this manipulation pipeline, the robot learning community has made tremendous efforts in either how to determine steadier grasping poses in demanding scenarios [3, 4, 5, 6, 7] or how to understand the exteroceptive inputs in a more robust and generalizable way [8, 9, 10, 11, 12, 13]. Acknowledging this progress, this paper attempts to bypass the challenges in the prevailing pipeline by advocating ArrayBot, a reinforcement-learning-driven system for distributed manipulation [14], where the objects are manipulated through a great number of actuators with only proprioceptive tactile sensing [15, 16, 17, 18]. Conceptually, the hardware of ArrayBot is a 16×16 array of vertically sliding pillars, each of which can be independently actuated, leading to a 16×16 action space. Functionally, the actuators beneath a tabletop object can support its weight and at the same time cooperate to lift, tilt, and even translate it through proper motion policies.
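
To make the interface concrete, here is a minimal environment-style stand-in for the 16×16 actuation grid described above: each action entry commands one pillar's height change, and the observation is purely proprioceptive. The class name, height limits, step size, and the omission of object dynamics and tactile contact signals are all illustrative assumptions.

```python
import numpy as np

GRID = 16  # the 16x16 array of independently actuated sliding pillars

class ArrayBotSketch:
    """Hypothetical stand-in for the ArrayBot actuation interface (not the
    authors' code). Actions are per-pillar height deltas; observations are
    the current pillar heights (proprioception only)."""

    def __init__(self, max_height=0.1, max_delta=0.005):
        self.heights = np.zeros((GRID, GRID))
        self.max_height = max_height  # assumed travel limit per pillar (meters)
        self.max_delta = max_delta    # assumed per-step height change (meters)

    def step(self, action):
        # action: (16, 16) array in [-1, 1], one entry per pillar.
        delta = np.clip(np.asarray(action), -1.0, 1.0) * self.max_delta
        self.heights = np.clip(self.heights + delta, 0.0, self.max_height)
        return self.heights.copy()  # proprioceptive observation

if __name__ == "__main__":
    env = ArrayBotSketch()
    rng = np.random.default_rng(0)
    obs = env.step(rng.uniform(-1.0, 1.0, size=(GRID, GRID)))
    print(obs.shape)  # (16, 16) -- matches the 16x16 action space
```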