Reviews: Hierarchical Reinforcement Learning with Advantage-Based Auxiliary Rewards
Neural Information Processing Systems
This is an interesting approach and seems novel in the context of options, although it has some similarities to potential-based reward shaping, e.g. Devlin and Kudenko (2012). The main advantages claimed for HAAR are (loosely) improved performance under sparse rewards and the learning of skills suitable for transfer. These claims could be made more explicit, which would help to justify the experimental section.

The authors define the advantage as:

A_h(s_t^h, a_t^h) = E[r_t^h + \gamma_h V_h(s_{t+k}^h) - V_h(s_t^h)]

The meaning of this is a little ambiguous, and I would prefer it to be clarified.
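As a point of reference for the definition being questioned, a one-sample estimate of that high-level advantage can be sketched as follows. This is an illustrative sketch only; the function and argument names are my own assumptions, not the authors' implementation, and k (the low-level skill length) is folded into the sampled next value.

```python
def high_level_advantage(r_t, v_s_t, v_s_tk, gamma_h=0.99):
    """One-sample estimate of the high-level advantage

        A_h = r_t^h + gamma_h * V_h(s_{t+k}^h) - V_h(s_t^h)

    where v_s_t and v_s_tk are the high-level value estimates at the
    current and next high-level state. Names are hypothetical.
    """
    return r_t + gamma_h * v_s_tk - v_s_t
```

Under this reading, the expectation in the paper's equation is over the k-step trajectory generated by the low-level policy, which is one of the points the notation leaves ambiguous.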
Jan-25-2025, 04:35:27 GMT