An Environment Model for Nonstationary Reinforcement Learning
Choi, Samuel P. M., Yeung, Dit-Yan, Zhang, Nevin Lianwen
Neural Information Processing Systems
Reinforcement learning in nonstationary environments is generally regarded as an important yet difficult problem. This paper partially addresses the problem by formalizing a subclass of nonstationary environments. The environment model, called the hidden-mode Markov decision process (HM-MDP), assumes that environmental changes are always confined to a small number of hidden modes.
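To make the idea of hidden modes concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes that each mode carries its own MDP dynamics and rewards, that the agent observes the state but not the mode, and that the mode switches according to a Markov chain; the last assumption is not stated in the abstract above. The class name `HiddenModeMDP` and all parameter names are hypothetical.

```python
import random


class HiddenModeMDP:
    """Toy environment: a few hidden modes, each with its own MDP dynamics."""

    def __init__(self, mode_trans, state_trans, rewards, seed=0):
        # mode_trans[m][m2]: probability of switching from mode m to m2
        #   (assumption: mode changes follow a Markov chain)
        # state_trans[m][s][a][s2]: state transition probability under mode m
        # rewards[m][s][a]: immediate reward under mode m
        self.mode_trans = mode_trans
        self.state_trans = state_trans
        self.rewards = rewards
        self.rng = random.Random(seed)
        self.mode = 0   # hidden from the agent
        self.state = 0  # observed by the agent

    def _sample(self, probs):
        # Draw an index according to the given probability vector.
        return self.rng.choices(range(len(probs)), weights=probs, k=1)[0]

    def step(self, action):
        # Reward and next state depend on the current (hidden) mode.
        reward = self.rewards[self.mode][self.state][action]
        self.state = self._sample(self.state_trans[self.mode][self.state][action])
        # The mode then evolves on its own, independently of the action.
        self.mode = self._sample(self.mode_trans[self.mode])
        return self.state, reward


# Two modes, two states, one action: mode 0 keeps the state, mode 1 flips it.
stay = [[[1.0, 0.0]], [[0.0, 1.0]]]  # indexed as [s][a][s2]
flip = [[[0.0, 1.0]], [[1.0, 0.0]]]
env = HiddenModeMDP(
    mode_trans=[[0.0, 1.0], [1.0, 0.0]],  # deterministic alternation, for illustration only
    state_trans=[stay, flip],
    rewards=[[[1.0], [1.0]], [[-1.0], [-1.0]]],
)
trace = [env.step(0) for _ in range(3)]
```

From the agent's point of view, the same action in the same state yields different rewards and successor states over time, which is the nonstationarity the model is meant to capture; the number of distinct behaviours is bounded by the number of modes.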
Dec-31-2000