A Reinforcement Learning Algorithm in Partially Observable Environments Using Short-Term Memory

Apr-6-2023, 17:29:07 GMT–Neural Information Processing Systems

We describe a Reinforcement Learning algorithm for partially observ(cid:173) able environments using short-term memory, which we call BLHT. Since BLHT learns a stochastic model based on Bayesian Learning, the over(cid:173) fitting problem is reasonably solved. Moreover, BLHT has an efficient implementation. This paper shows that the model learned by BLHT con(cid:173) verges to one which provides the most accurate predictions of percepts and rewards, given short-term memory.

observable environment, reinforcement learning algorithm, short-term memory, (2 more...)

Neural Information Processing Systems

Apr-6-2023, 17:29:07 GMT

Conferences Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)