Restoring Noisy Demonstration for Imitation Learning With Diffusion Models
Chen, Shang-Fu, Yong, Co, Sun, Shao-Hua
arXiv.org Artificial Intelligence
Abstract--Imitation learning (IL) aims to learn a policy from expert demonstrations and has been applied to various applications. By learning from the expert policy, IL methods do not require environmental interactions or reward signals. However, most existing imitation learning algorithms assume perfect expert demonstrations, whereas in practice demonstrations often contain imperfections caused by errors from human experts or sensor/control system inaccuracies. To address this problem, this work proposes a filter-and-restore framework to best leverage expert demonstrations with inherent noise. Our proposed method first filters clean samples from the demonstrations and then learns conditional diffusion models to recover the noisy ones. We evaluate our proposed framework and existing methods in various domains, including robot arm manipulation, dexterous manipulation, and locomotion. The experimental results show that our proposed framework consistently outperforms existing methods across all the tasks. Ablation studies further validate the effectiveness of each component and demonstrate the framework's robustness to different noise types and levels. These results confirm the practical applicability of our framework to noisy offline demonstration data.
Imitation learning [1]-[13] aims to learn a policy from expert demonstrations and has been applied to various applications, including robotics [8], industrial automation, strategy board games, video games, etc. [14]-[19]. Unlike reinforcement learning (RL), which acquires a policy in a trial-and-error manner that can be unsafe or expensive, imitation learning (IL) algorithms can learn without environmental interactions. Furthermore, while designing sophisticated RL reward functions is often difficult and tedious [20], [21], IL methods learn from expert demonstrations and do not require reward signals.
Despite this wide applicability, most existing imitation learning algorithms assume perfect (i.e., optimal and clean) expert demonstrations, which can be challenging and expensive to collect. In practice, expert demonstrations often contain imperfections caused by errors from human experts or sensor and control system inaccuracies.
Oct-17-2025