Minimax Optimal Online Imitation Learning via Replay Estimation
Neural Information Processing Systems
Online imitation learning is the problem of how best to mimic expert demonstrations, given access to the environment or an accurate simulator. Prior work has shown that in the infinite sample regime, exact moment matching achieves value equivalence to the expert policy.
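As a brief illustration of why exact moment matching gives value equivalence, here is the standard argument in notation that is assumed for this sketch rather than taken from the text above: $J(\pi)$ is the expected cumulative reward of policy $\pi$ over a horizon $H$, $\pi_E$ is the expert, and the unknown reward $r$ is assumed to lie in the class $\mathcal{F}$ of moment functions being matched.

```latex
% Assumed notation: horizon H, expert \pi_E, learner \pi, moment class \mathcal{F} with r \in \mathcal{F}.
\begin{align*}
J(\pi_E) - J(\pi)
  &= \mathbb{E}_{\pi_E}\Big[\sum_{h=1}^{H} r(s_h, a_h)\Big]
   - \mathbb{E}_{\pi}\Big[\sum_{h=1}^{H} r(s_h, a_h)\Big] \\
  &\le \sup_{f \in \mathcal{F}} \Big(
     \mathbb{E}_{\pi_E}\Big[\sum_{h=1}^{H} f(s_h, a_h)\Big]
   - \mathbb{E}_{\pi}\Big[\sum_{h=1}^{H} f(s_h, a_h)\Big] \Big).
\end{align*}
```

If the learner matches every moment exactly, each difference inside the supremum is zero, including the one for $f = r$, so $J(\pi) = J(\pi_E)$. With only finitely many demonstrations the expert expectations must be estimated, so the match, and hence the value guarantee, holds only up to estimation error.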
Mar-19-2025, 20:24:53 GMT