Reviews: Hierarchical Reinforcement Learning with Advantage-Based Auxiliary Rewards

Jan-25-2025, 04:18:29 GMT – Neural Information Processing Systems

The paper presents HAAR, a hierarchical reinforcement learning approach built on the idea of using the advantage, or temporal-difference error, of the high-level controller to provide the reward signal for the lower level. The reviewers judged the approach to be novel and the empirical results promising. The analytical results provide improvement guarantees similar to those of a base algorithm such as TRPO. Several areas for improvement were mentioned, and many of these were addressed in the rebuttal. For example, the reviewers were pleased to see the additional experiment showing performance from random skill initialization.
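The core idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the high-level advantage is estimated by a one-step TD error, and that this advantage is split evenly across the `k` low-level steps taken during one high-level step. The function names, the even split, and the parameter `k` are hypothetical choices made for illustration.

```python
def high_level_td_advantage(reward, value_s, value_next, gamma=0.99, done=False):
    """One-step TD error, used here as an advantage estimate for a high-level action.

    reward     -- environment reward accumulated over the high-level step
    value_s    -- high-level value estimate at the start state
    value_next -- high-level value estimate at the resulting state
    """
    bootstrap = 0.0 if done else gamma * value_next
    return reward + bootstrap - value_s


def low_level_auxiliary_rewards(advantage, k):
    """Share one high-level advantage equally among k low-level steps.

    The even split is an assumption of this sketch, not a detail taken
    from the review text.
    """
    if k <= 0:
        raise ValueError("k must be a positive number of low-level steps")
    return [advantage / k] * k


# Example: one high-level step that spans four low-level steps.
adv = high_level_td_advantage(reward=1.0, value_s=0.5, value_next=1.0, gamma=0.5)
aux = low_level_auxiliary_rewards(adv, k=4)
```

Under these assumptions, a high-level action that turns out better than expected (positive advantage) rewards the low-level skill that executed it, and a worse-than-expected one penalizes it.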

advantage-based auxiliary reward, hierarchical reinforcement learning, observability

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)