HIQL: Offline Goal-Conditioned RL with Latent States as Actions
Seohong Park, Benjamin Eysenbach
Unsupervised pre-training has recently become the bedrock of computer vision and natural language processing. In reinforcement learning (RL), goal-conditioned RL can potentially provide an analogous self-supervised approach for making use of large quantities of unlabeled (reward-free) data. However, building effective goal-conditioned RL algorithms that learn directly from diverse offline data is challenging, because it is hard to accurately estimate the value function for faraway goals. Nonetheless, goal-reaching problems exhibit structure: reaching a distant goal entails first passing through closer subgoals. This structure can be very useful, as assessing the quality of actions for nearby goals is typically easier than for more distant goals.
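This subgoal structure is naturally expressed as a two-level policy in which the high level's "actions" are nearby (latent) states, as the paper's title suggests. The following is a minimal sketch of that decomposition, not the paper's actual HIQL implementation; the `HierarchicalGoalPolicy` class and the `high`/`low` stand-in functions are hypothetical illustrations.

```python
# Minimal sketch of hierarchical goal reaching (illustrative, not HIQL itself).
import numpy as np

class HierarchicalGoalPolicy:
    """Decomposes goal reaching: a high-level policy proposes a nearby
    subgoal, and a low-level policy picks actions to reach that subgoal."""

    def __init__(self, high_policy, low_policy):
        self.high_policy = high_policy  # maps (state, distant goal) -> subgoal
        self.low_policy = low_policy    # maps (state, subgoal) -> action

    def select_action(self, state, goal):
        # The high level only needs to evaluate nearby subgoals, whose
        # values are easier to estimate than those of faraway goals.
        subgoal = self.high_policy(state, goal)
        return self.low_policy(state, subgoal)

# Hypothetical stand-ins for learned policies, for illustration only:
high = lambda s, g: s + 0.1 * (g - s)            # step a fraction toward the goal
low = lambda s, sg: np.clip(sg - s, -1.0, 1.0)   # greedy move toward the subgoal

policy = HierarchicalGoalPolicy(high, low)
state, goal = np.zeros(2), 10.0 * np.ones(2)
print(policy.select_action(state, goal))  # action aimed at the nearby subgoal
```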
Neural Information Processing Systems
- Genre:
  - Instructional Material (0.67)
  - Research Report > New Finding (0.68)
- Industry:
  - Education (0.46)
- Technology:
  - Information Technology > Artificial Intelligence:
    - Machine Learning > Reinforcement Learning (1.00)
    - Natural Language (1.00)
    - Robots (1.00)