Efficient Bayesian Policy Reuse with a Scalable Observation Model in Deep Reinforcement Learning

Open in new window