Exponentially Weighted Imitation Learning for Batched Historical Data
Qing Wang, Jiechao Xiong, Lei Han, Peng Sun, Han Liu, Tong Zhang
Neural Information Processing Systems
We consider deep policy learning with only batched historical trajectories. The main challenge of this problem is that the learner no longer has a simulator or "environment oracle" as in most reinforcement learning settings.
Feb-12-2026, 18:47:41 GMT
- Country:
- North America > Canada > Quebec > Montreal (0.04)
- Industry:
- Leisure & Entertainment (0.46)
- Technology: