Reinforcement Learning



Model-based Safe Deep Reinforcement Learning via a Constrained Proximal Policy Optimization Algorithm

Neural Information Processing Systems

During initial iterations of training in most Reinforcement Learning (RL) algorithms, agents perform a significant number of random exploratory steps.



A Self-Tuning Actor-Critic Algorithm

Neural Information Processing Systems

In this paper, we take a step towards addressing this issue by automatically adapting hyperparameters online via meta-gradient descent (Xu et al., 2018).