 Reinforcement Learning


Uniform-PAC Bounds for Reinforcement Learning with Linear Function Approximation

Neural Information Processing Systems

Designing efficient reinforcement learning (RL) algorithms for environments with large state and action spaces is one of the main tasks in the RL community.




Unpacking Reward Shaping

Neural Information Processing Systems

Much of this work is based on upper confidence bound (UCB) principles and prescribes some kind of exploration bonus to prioritize exploration of rarely visited regions.



Distributional Policy Evaluation: a Maximum Entropy approach to Representation Learning

Neural Information Processing Systems

In Distributional Reinforcement Learning (D-RL) [Bellemare et al., 2023], an agent aims to estimate the full distribution of the return, in contrast to classic RL [Sutton and Barto, 2018], where the objective is to predict the expected return only. In Section 3, we answer this methodological question, showing that it is possible to reformulate Policy Evaluation in a distributional setting so that its performance index is explicitly intertwined with the representation of the (state or action) spaces.