Bridging the Imitation Gap by Adaptive Insubordination
Neural Information Processing Systems
In practice, imitation learning is preferred over pure reinforcement learning whenever it is possible to design a teaching agent to provide expert supervision. However, we show that when the teaching agent makes decisions with access to privileged information that is unavailable to the student, this information is marginalized during imitation learning, resulting in an imitation gap and, potentially, poor results.
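The marginalization effect can be illustrated with a toy example. The sketch below is not the paper's method or experimental setup; it assumes a hypothetical two-door task in which the expert sees a hidden bit telling it which door is correct, while the student always receives the same uninformative observation. All names (`expert_action`, `fit_student`, the `"start"` observation) are invented for illustration. Because the student cannot condition on the hidden bit, the best it can do when imitating is to reproduce the expert's action frequencies averaged over that bit.

```python
import random
from collections import Counter


def expert_action(hidden_bit):
    # Hypothetical expert: it sees the hidden bit and picks the matching door.
    return hidden_bit


def fit_student(demos):
    # The student's observation is identical in every demo, so the
    # maximum-likelihood imitation policy is just the empirical action
    # frequencies, i.e. the expert's policy averaged over the hidden bit.
    counts = Counter(action for _obs, action in demos)
    total = sum(counts.values())
    return {action: count / total for action, count in counts.items()}


rng = random.Random(0)
demos = []
for _ in range(10_000):
    hidden = rng.randint(0, 1)  # privileged information, uniform over {0, 1}
    demos.append(("start", expert_action(hidden)))  # student only sees "start"

student_policy = fit_student(demos)

# The expert always opens the correct door.
expert_success = 1.0
# The student's action is independent of the hidden bit, so each action
# is correct with probability 0.5 regardless of how often it is chosen.
student_success = sum(prob * 0.5 for prob in student_policy.values())

print("student policy:", student_policy)
print("expert success:", expert_success, "student success:", student_success)
```

In this toy setting the student imitates the expert's action distribution as well as it possibly can, yet succeeds only half the time. The difference between the two success rates is a simple instance of the imitation gap described above.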
Dec-24-2025, 14:50:54 GMT