 Reinforcement Learning




6191ab7080c840f67eaf5dff7d5edfcb-Supplemental-Conference.pdf

Neural Information Processing Systems

Diversity in equally-performing policies. We show that different neighborhoods correspond to different post-update return distributions and agent behaviors. We discover that at equal average returns, different policies obtained by the same deep RL algorithm may in fact have substantially different distributional profiles, as measured by statistics of the post-update return distribution.





eb3c8135137c8a60425a0320869ad87e-Paper-Conference.pdf

Neural Information Processing Systems

Recently, reinforcement learning (RL) based approaches have attracted increasing attention for dynamic resource management as RL helps automatically adapt to a specific user workload.


Near Optimal Exploration-Exploitation in Non-Communicating Markov Decision Processes

Neural Information Processing Systems

Reinforcement learning (RL) [1] studies the problem of learning in sequential decision-making problems where the dynamics of the environment is unknown, but can be learnt by performing actions and observing their outcome in an online fashion. A sample-efficient RL agent must trade off the exploration needed to collect information about the environment, and the exploitation of the experience gathered so far to gain as much reward as possible.