Reinforcement Learning with a Terminator
Guy Tennenholtz
–Neural Information Processing Systems
We present the problem of reinforcement learning with exogenous termination. We define the Termination Markov Decision Process (TerMDP), an extension of the MDP framework, in which episodes may be interrupted by an external non-Markovian observer.
Aug-19-2025, 15:24:05 GMT
- Country:
- Asia > Middle East
- Israel (0.04)
- Europe > United Kingdom
- England > Cambridgeshire > Cambridge (0.04)
- Genre:
- Research Report > New Finding (1.00)
- Industry:
- Information Technology (0.68)
- Transportation (0.46)