Reinforcement Learning

Inferring learning rules from animal decision-making

Neural Information Processing Systems

Our method efficiently infers the trial-to-trial changes in an animal's policy, and decomposes those changes into a learning component and a noise component.


Fractal Landscapes in Policy Optimization

Neural Information Processing Systems

The understanding of such failure cases is still limited. For instance, the training process of reinforcement learning is unstable and the learning curve can fluctuate during training in ways that are hard to predict. The probability of obtaining satisfactory policies can also be inherently low in reward-sparse or highly nonlinear control tasks.