Reviews: Learning Reward Machines for Partially Observable Reinforcement Learning

Jan-23-2025, 17:59:19 GMT - Neural Information Processing Systems

The authors propose a novel approach to solving POMDPs by learning a reward machine and solving it at the same time. The method builds a finite state machine that correctly predicts the possible observations and rewards. The authors demonstrate that their method outperforms baselines in three different partially observable gridworlds. Overall, I found the paper clear and well motivated. Learning to solve POMDPs is a very challenging problem, and any progress or insight could have a big impact.
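
For readers unfamiliar with the term, a reward machine can be pictured as a small finite state machine whose transitions fire on high-level events and emit rewards. The sketch below is only an illustration of that data structure, not the authors' learning procedure: the class name `RewardMachine`, the event labels `"key"`, `"door"`, and `"empty"`, and the two-step task are all hypothetical. The machine state serves as a compact memory of what has already happened, which is the kind of information a partially observable agent otherwise lacks.

```python
from dataclasses import dataclass, field


@dataclass
class RewardMachine:
    """Finite state machine whose transitions fire on events and emit rewards."""

    initial_state: int
    # Maps (state, event) -> (next_state, reward).
    transitions: dict = field(default_factory=dict)

    def step(self, state, event):
        # Pairs that are not listed leave the state unchanged and give zero reward.
        return self.transitions.get((state, event), (state, 0.0))

    def run(self, events):
        # Feed a sequence of events through the machine and sum the rewards.
        state, total = self.initial_state, 0.0
        for event in events:
            state, reward = self.step(state, event)
            total += reward
        return state, total


# Hypothetical task: pick up a key (state 0 -> 1), then reach a door (state 1 -> 2).
rm = RewardMachine(
    initial_state=0,
    transitions={
        (0, "key"): (1, 0.0),
        (1, "door"): (2, 1.0),
    },
)

# Reaching the door before holding the key earns nothing; afterwards it pays off.
final_state, total_reward = rm.run(["door", "key", "empty", "door"])
```

In this toy example the same event, `"door"`, yields a different reward depending on the machine state, so a policy conditioned on that state can act correctly even when the raw observation alone is ambiguous.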

Genre:
- Research Report (0.58)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Reinforcement Learning (0.40)
  - Representation & Reasoning > Optimization (0.34)