Hierarchical Memory-Based Reinforcement Learning
–Neural Information Processing Systems
A key challenge for reinforcement learning is scaling up to large partially observable domains. In this paper, we show how a hierarchy of behaviors can be used to create and select among variable-length short-term memories appropriate for a task. At higher levels in the hierarchy, the agent abstracts over lower-level details and looks back over a variable number of high-level decisions in time. We formalize this idea in a framework called Hierarchical Suffix Memory (HSM). HSM uses a memory-based SMDP learning method to rapidly propagate delayed reward across long decision sequences.
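To make the idea of propagating delayed reward across temporally extended decisions concrete, the following is a minimal sketch of a generic tabular SMDP Q-learning update. It is not the memory-based method of HSM itself, which the abstract does not specify; the function name `smdp_q_update`, the table layout, and the parameters `alpha` and `gamma` are assumptions made for illustration only.

```python
from collections import defaultdict


def smdp_q_update(q, state, behavior, rewards, next_state, behaviors,
                  alpha=0.1, gamma=0.9):
    """Apply one generic SMDP Q-learning update (illustrative, not HSM itself).

    `rewards` holds the primitive-step rewards collected while the
    temporally extended `behavior` ran, so its length k is the duration.
    """
    k = len(rewards)
    # Discounted return accumulated while the behavior was executing.
    discounted_return = sum((gamma ** i) * r for i, r in enumerate(rewards))
    # Bootstrap from the best behavior at the state where control returned.
    best_next = max(q[(next_state, b)] for b in behaviors)
    # The bootstrap term is discounted by gamma^k because k steps elapsed.
    target = discounted_return + (gamma ** k) * best_next
    q[(state, behavior)] += alpha * (target - q[(state, behavior)])
    return q[(state, behavior)]


# Hypothetical usage: a three-step behavior whose only reward arrives last.
q_table = defaultdict(float)
q_table[("s1", "a")] = 1.0
updated = smdp_q_update(q_table, "s0", "a", [0.0, 0.0, 1.0], "s1",
                        ["a", "b"], alpha=0.5, gamma=0.5)
```

Because a single update spans all k primitive steps of a behavior, reward reaches earlier decisions in fewer updates than one-step learning would need, which is the general motivation for learning at the level of behaviors.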
Apr-6-2023, 17:02:11 GMT
- Country:
- North America > United States > Michigan > Ingham County
- Lansing (0.10)
- East Lansing (0.10)