Reviews: Exploration via Hindsight Goal Generation

Jan-23-2025, 22:02:49 GMT–Neural Information Processing Systems

The authors propose a new method for sampling exploration goals in goal-conditioned RL with hindsight experience replay (HER). They derive a lower bound that depends on a Lipschitz property of the goal-conditioned value function with respect to the distance between goals and states. Across various Fetch-robot tasks, the method, when combined with EBP (a method for relabeling goals), outperforms HER. The ablations also show that the method is relatively insensitive to hyperparameter values. Overall, the empirical results are solid, but the math behind the paper is rather troubling.
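
For concreteness, a Lipschitz property of the kind mentioned above might be written as follows. This is only an illustrative sketch: the symbols $V^{\pi}$ (goal-conditioned value function), $d$ (a metric on state-goal pairs), and $L$ (the Lipschitz constant) are assumed notation and are not taken from the paper or from this review.

```latex
% Illustrative only: assumed notation, not the paper's exact statement.
% V^\pi : goal-conditioned value function
% d     : a metric on (state, goal) pairs
% L     : Lipschitz constant
\[
  \bigl| V^{\pi}(s, g) - V^{\pi}(s', g') \bigr|
  \;\le\; L \cdot d\bigl((s, g),\, (s', g')\bigr)
  \qquad \text{for all } (s, g),\ (s', g').
\]
```

Under such a condition, goals that are close under $d$ have similar values, which is the sense in which a bound can depend on the distance between goals and states.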

exploration, hindsight goal generation, learning
