
 Reinforcement Learning


Zero-Shot Transfer with Deictic Object-Oriented Representation in Reinforcement Learning

Neural Information Processing Systems

Object-oriented representations in reinforcement learning have shown promise in transfer learning, with previous research introducing a propositional object-oriented framework that has provably efficient learning bounds with respect to sample complexity.






e6d8545daa42d5ced125a4bf747b3688-AuthorFeedback.pdf

Neural Information Processing Systems

The common specifications in Appendix D are just detailed descriptions of each hyperparameter used in the Nature DQN paper that we applied to all the baselines and our method for the experiment. Many of the recent reinforcement learning methods require changes in the network structures or require additional memory structures (Ephemeral Value Adjustments, RUDDER). The idea of the backward update is not novel and we have stated in section 3.1 that the tabular backward update (Algorithm 1) is a special case of Lin's method (1992). The training process of the adaptive scheme is described in Appendix A. All the K networks are trained using the same sample episode at the same time.
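As a rough illustration of what a tabular backward update can look like, the sketch below replays one episode's transitions from last to first and applies the ordinary Q-learning rule at each step. It is a generic example, not the paper's Algorithm 1: the function name `backward_q_update`, the transition tuple layout, the toy states, and the values of `alpha` and `gamma` are all assumptions made for illustration.

```python
from collections import defaultdict


def backward_q_update(q, episode, alpha=1.0, gamma=0.99):
    """Apply Q-learning updates to one episode in reverse order.

    q: mapping state -> mapping action -> value (hypothetical layout).
    episode: list of (state, action, reward, next_state, done) tuples.
    """
    for state, action, reward, next_state, done in reversed(episode):
        next_values = list(q[next_state].values())
        # No bootstrap from terminal states or from states with no estimates yet.
        bootstrap = 0.0 if done or not next_values else max(next_values)
        target = reward + gamma * bootstrap
        q[state][action] += alpha * (target - q[state][action])
    return q


# Toy three-step episode with a single reward at the end (assumed data).
q = defaultdict(lambda: defaultdict(float))
episode = [
    ("s0", "right", 0.0, "s1", False),
    ("s1", "right", 0.0, "s2", False),
    ("s2", "right", 1.0, "s3", True),
]
backward_q_update(q, episode, alpha=1.0, gamma=0.5)
```

With these toy settings, a single backward pass carries the terminal reward all the way back to the first state, whereas replaying the same transitions in forward order from a zero-initialized table would only change the value of the last transition.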





Exploration in Structured Reinforcement Learning

Neural Information Processing Systems

Hence, with large state and action spaces, it is essential to identify and exploit any possible structure existing in the system dynamics and reward function so as to minimize exploration phases and in turn reduce regret to reasonable values. Modern RL algorithms actually implicitly impose some structural properties either in the model parameters (transition probabilities and reward function, see e.g.