Imitation Learning from Imperfection: Theoretical Justifications and Algorithms
Neural Information Processing Systems
Imitation learning (IL) algorithms excel at acquiring high-quality policies from expert data for sequential decision-making tasks. However, their effectiveness is hampered when expert data are limited. To tackle this challenge, a novel framework called (offline) IL with supplementary data has been proposed [25, 61], which enhances learning by incorporating an additional yet imperfect dataset obtained inexpensively from sub-optimal policies. Nonetheless, learning becomes challenging because the supplementary dataset may include out-of-expert-distribution samples. In this work, we propose a mathematical formalization of this framework, uncovering its limitations.
Feb-6-2025, 04:42:29 GMT
- Genre:
- Research Report > New Finding (1.00)