A Distributional Analogue to the Successor Representation

Wiltzer, Harley, Farebrother, Jesse, Gretton, Arthur, Tang, Yunhao, Barreto, André, Dabney, Will, Bellemare, Marc G., Rowland, Mark

Feb-13-2024–arXiv.org Machine Learning

This paper contributes a new approach for distributional reinforcement learning which elucidates a clean separation of transition structure and reward in the learning process. Analogous to how the successor representation (SR) describes the expected consequences of behaving according to a given policy, our distributional successor measure (SM) describes the distributional consequences of this behaviour. We formulate the distributional SM as a distribution over distributions and provide theory connecting it with distributional and model-based reinforcement learning. Moreover, we propose an algorithm that learns the distributional SM from data by minimizing a two-level maximum mean discrepancy. Key to our method are a number of algorithmic techniques that are independently valuable for learning generative models of state. As an illustration of the usefulness of the distributional SM, we show that it enables zero-shot risk-sensitive policy evaluation in a way that was not previously possible.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Machine Learning

Feb-13-2024

arXiv.org PDF

Add feedback

Country:
- North America
  - Canada > Quebec (0.14)
  - United States > Massachusetts (0.14)

Genre:
- Research Report (0.82)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Reinforcement Learning (1.00)
  - Representation & Reasoning (1.00)