Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems

Apr-6-2023, 18:33:38 GMT–Neural Information Processing Systems

Increasing attention has been paid to reinforcement learning algorithms in recent years, partly due to successes in the theoretical analysis of their behavior in Markov environments. If the Markov assumption is removed, however, neither the algorithms nor the analyses generally continue to be usable. We propose and analyze a new learning algorithm to solve a certain class of non-Markov decision problems. Our algorithm applies to problems in which the environment is Markov, but the learner has restricted access to state information. The algorithm involves a Monte-Carlo policy evaluation combined with a policy improvement method that is similar to that of Markov decision problems and is guaranteed to converge to a local maximum.
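As a rough illustration of the two ingredients named in the abstract, the sketch below alternates a Monte Carlo estimate of action values indexed by observation with a small policy-improvement step on a stochastic policy. Everything in it is an assumption made for illustration and is not taken from the paper: the three-state toy environment (`OBS`, `REWARD`), the discounted every-visit return, the mixing step `step`, and all function names. It carries no convergence guarantee of its own.

```python
import random
from collections import defaultdict

# Hypothetical toy problem: three hidden states, two observations.
OBS = [0, 0, 1]                                 # hidden state -> observation (states 0 and 1 are aliased)
REWARD = [[0.0, 1.0], [1.0, 0.5], [1.0, 0.0]]   # REWARD[state][action]
N_ACTIONS = 2

def run_episode(policy, rng, horizon=20):
    """Sample one trajectory of (observation, action, reward) triples."""
    state = rng.randrange(3)
    trajectory = []
    for _ in range(horizon):
        obs = OBS[state]
        action = rng.choices(range(N_ACTIONS), weights=policy[obs])[0]
        trajectory.append((obs, action, REWARD[state][action]))
        state = rng.randrange(3)                # toy dynamics: next hidden state is uniform
    return trajectory

def mc_evaluate(policy, rng, episodes=2000, gamma=0.9):
    """Every-visit Monte Carlo estimate of Q(observation, action)."""
    total = defaultdict(float)
    count = defaultdict(int)
    for _ in range(episodes):
        g = 0.0
        for obs, action, reward in reversed(run_episode(policy, rng)):
            g = reward + gamma * g              # discounted return from this step onward
            total[(obs, action)] += g
            count[(obs, action)] += 1
    return {key: total[key] / count[key] for key in total}

def improve(policy, q, step=0.2):
    """Shift each observation's action distribution a small step toward the greedy action."""
    new_policy = {}
    for obs, probs in policy.items():
        greedy = max(range(N_ACTIONS), key=lambda a: q.get((obs, a), float("-inf")))
        new_policy[obs] = [(1 - step) * p + (step if a == greedy else 0.0)
                           for a, p in enumerate(probs)]
    return new_policy

rng = random.Random(0)
policy = {0: [0.5, 0.5], 1: [0.5, 0.5]}         # stochastic policy over observations, not states
q = mc_evaluate(policy, rng)
policy = improve(policy, q)
```

The policy is keyed by observation because the learner never sees the hidden state. Taking a small step instead of jumping to the greedy action keeps the policy stochastic, which matters when several hidden states share one observation.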

observable markov decision problem, reinforcement learning algorithm, stochastic policy

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Reinforcement Learning (1.00)
  - Learning Graphical Models > Undirected Networks
    - Markov Models (0.40)