Non-Stationary Markov Decision Processes, a Worst-Case Approach using Model-Based Reinforcement Learning
Erwan Lecarpentier, Emmanuel Rachelson
–Neural Information Processing Systems
This work tackles the problem of robust zero-shot planning in non-stationary stochastic environments. We study Markov Decision Processes (MDPs) evolving over time and consider Model-Based Reinforcement Learning algorithms in this setting. We make two hypotheses: 1) the environment evolves continuously with a bounded evolution rate; 2) a current model is known at each decision epoch, but not its evolution. Our contribution is fourfold. We introduce the notion of regular evolution by making a Lipschitz continuity hypothesis on the transition and reward functions w.r.t. time.
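The Lipschitz continuity hypothesis above can be made concrete with a small sketch: if the reward function is L_r-Lipschitz in time, then a model snapshot taken at decision epoch t0 yields a worst-case (pessimistic) lower bound on the reward at any later time t. The state/action names, reward values, and the constant `L_r` below are illustrative assumptions, not taken from the paper.

```python
# Hypothetical snapshot of a non-stationary MDP's reward function at epoch t0.
# Assumption: the true reward at time t deviates from the snapshot by at most
# L_r * |t - t0|, where L_r is the (assumed) Lipschitz constant w.r.t. time.

L_r = 0.1  # assumed Lipschitz constant of the reward function w.r.t. time
reward_t0 = {("s0", "a0"): 1.0, ("s0", "a1"): 0.5}  # snapshot rewards at t0

def worst_case_reward(s, a, t, t0=0.0):
    """Lower bound on the reward of (s, a) at time t, from the snapshot at t0."""
    return reward_t0[(s, a)] - L_r * abs(t - t0)

def robust_greedy_action(s, t):
    """Pick the action maximizing the worst-case (pessimistic) reward."""
    return max(("a0", "a1"), key=lambda a: worst_case_reward(s, a, t))
```

For example, two time units after the snapshot, every reward bound has decayed by 0.2, and the robust greedy choice in `s0` is still `a0`. The full worst-case planner in the paper reasons over entire trajectories, not just one step; this sketch only illustrates the bounded-evolution idea.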
Mar-18-2020, 23:31:31 GMT