Beyond-Expert Performance with Limited Demonstrations: Efficient Imitation Learning with Double Exploration
Zhao, Heyang, Yu, Xingrui, Bossens, David M., Tsang, Ivor W., Gu, Quanquan
arXiv.org Artificial Intelligence
Imitation learning is a central problem in reinforcement learning, where the goal is to learn a policy that mimics the expert's behavior. In practice, it is often challenging to accurately learn the expert policy from a limited number of demonstrations because of the complexity of the state space. Moreover, achieving beyond-expert performance requires exploring the environment and collecting data. To overcome these challenges, we propose a novel imitation learning algorithm called Imitation Learning with Double Exploration (ILDE), which implements exploration in two ways: (1) optimistic policy optimization via an exploration bonus that rewards state-action pairs with high uncertainty, which can improve convergence to the expert policy, and (2) curiosity-driven exploration of states that deviate from the demonstration trajectories, which can yield beyond-expert performance. Empirically, we demonstrate that ILDE outperforms state-of-the-art imitation learning algorithms in terms of sample efficiency and achieves beyond-expert performance on Atari and MuJoCo tasks with fewer demonstrations than previous work. We also provide a theoretical justification of ILDE as an uncertainty-regularized policy optimization method with optimistic exploration, which yields regret that grows sublinearly in the number of episodes.
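The two exploration terms described in the abstract can be illustrated with a toy reward-shaping sketch. This is not the authors' implementation: the class name `DoubleExplorationReward`, the count-based uncertainty proxy, the nearest-demonstration distance used as a curiosity signal, and the weights `beta` and `lam` are all assumptions chosen only to make the idea concrete.

```python
import math
from collections import defaultdict


class DoubleExplorationReward:
    """Toy reward shaping with two exploration terms (illustrative only)."""

    def __init__(self, demo_states, beta=1.0, lam=0.5):
        self.demo_states = list(demo_states)  # states seen in the demonstrations
        self.beta = beta                      # hypothetical weight of the uncertainty bonus
        self.lam = lam                        # hypothetical weight of the curiosity bonus
        self.counts = defaultdict(int)        # visit counts N(s, a)

    def uncertainty_bonus(self, state, action):
        # Assumed count-based proxy for uncertainty:
        # rarely tried state-action pairs receive a larger bonus.
        return self.beta / math.sqrt(self.counts[(state, action)] + 1)

    def curiosity_bonus(self, state):
        # Assumed curiosity signal: Euclidean distance from the state
        # to the nearest demonstration state.
        nearest = min(math.dist(state, d) for d in self.demo_states)
        return self.lam * nearest

    def shaped_reward(self, state, action, imitation_reward):
        # Imitation reward plus both bonuses; the visit count is
        # updated after the bonus is computed.
        reward = (imitation_reward
                  + self.uncertainty_bonus(state, action)
                  + self.curiosity_bonus(state))
        self.counts[(state, action)] += 1
        return reward


# Usage: states are coordinate tuples, actions are integers.
shaper = DoubleExplorationReward([(0.0, 0.0), (1.0, 0.0)])
r_first = shaper.shaped_reward((0.0, 0.0), 0, imitation_reward=0.0)
r_repeat = shaper.shaped_reward((0.0, 0.0), 0, imitation_reward=0.0)
r_far = shaper.shaped_reward((0.0, 2.0), 0, imitation_reward=0.0)
```

In this sketch the uncertainty bonus shrinks as a state-action pair is revisited, while the curiosity bonus grows with distance from the demonstrated states, so a repeated on-demonstration step earns less than a first visit to an off-demonstration state.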
Jun-26-2025
- Country:
  - Asia (0.28)
  - North America > United States
    - California (0.28)
- Genre:
  - Research Report > New Finding (0.46)
- Industry:
  - Leisure & Entertainment > Games (0.48)
- Technology:
  - Information Technology > Artificial Intelligence
    - Robots (1.00)
    - Representation & Reasoning > Expert Systems (1.00)
    - Machine Learning
      - Statistical Learning (1.00)
      - Reinforcement Learning (1.00)
      - Neural Networks (1.00)