Offline Goal-conditioned Reinforcement Learning with Quasimetric Representations
Neural Information Processing Systems
Approaches for goal-conditioned reinforcement learning (GCRL) often use learned state representations to extract goal-reaching policies. Two frameworks for representation structure have yielded particularly effective GCRL algorithms: (1) methods that learn successor features with a contrastive objective that performs inference over future outcomes, and (2) methods that link the (quasimetric) distance in representation space to the transit time from states to goals. We propose an approach that unifies these two frameworks, using the structure of a quasimetric representation space (the triangle inequality) with the right additional constraints to learn successor representations that enable optimal goal-reaching.
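To make the quasimetric structure concrete, the following toy sketch computes transit times on a small directed graph and checks the two properties the abstract relies on: the triangle inequality holds, while symmetry need not. This illustrates the general notion only and is not the paper's method. The deterministic unit-cost environment, the three-state cycle, and the function names `shortest_transit_times` and `is_quasimetric` are all assumptions introduced for the example.

```python
import itertools

INF = float("inf")


def shortest_transit_times(n, edges):
    """All-pairs minimum step counts on a directed graph (Floyd-Warshall).

    Assumes a deterministic environment where each transition costs one step.
    """
    d = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for u, v in edges:
        d[u][v] = min(d[u][v], 1)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


def is_quasimetric(d):
    """Check d(x, x) = 0 and d(x, z) <= d(x, y) + d(y, z); symmetry is not required."""
    n = len(d)
    zero_diag = all(d[i][i] == 0 for i in range(n))
    triangle = all(
        d[i][j] <= d[i][k] + d[k][j]
        for i, j, k in itertools.product(range(n), repeat=3)
    )
    return zero_diag and triangle


# Hypothetical 3-state directed cycle: 0 -> 1 -> 2 -> 0
edges = [(0, 1), (1, 2), (2, 0)]
d = shortest_transit_times(3, edges)

print(d[0][1], d[1][0])   # going 0 -> 1 takes fewer steps than 1 -> 0
print(is_quasimetric(d))
```

In this toy setting the transit time from state 0 to state 1 differs from the time in the reverse direction, which is why a quasimetric (asymmetric) distance, rather than a symmetric metric, is the natural structure for goal-reaching.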
Jun-10-2026, 22:26:18 GMT