Non-Stationary Markov Decision Processes, a Worst-Case Approach using Model-Based Reinforcement Learning
–Neural Information Processing Systems
This work tackles the problem of robust zero-shot planning in non-stationary stochastic environments. We study Markov Decision Processes (MDPs) evolving over time and consider Model-Based Reinforcement Learning algorithms in this setting. We make two hypotheses: 1) the environment evolves continuously with a bounded evolution rate; 2) a current model is known at each decision epoch but not its evolution. Our contribution can be presented in four points.
model-based reinforcement learning, non-stationary markov decision process, worst-case approach, (8 more...)
Neural Information Processing Systems
Dec-25-2025, 16:20:26 GMT
- Technology: