Control as Hybrid Inference
Tschantz, Alexander; Millidge, Beren; Seth, Anil K.; Buckley, Christopher L.
arXiv.org Artificial Intelligence
The field of reinforcement learning can be split into model-based and model-free methods. Here, we unify these approaches by casting model-free policy optimisation as amortised variational inference, and model-based planning as iterative variational inference, within a 'control as hybrid inference' (CHI) framework. We present an implementation of CHI which naturally mediates the balance between iterative and amortised inference. Using a didactic experiment, we demonstrate that the proposed algorithm operates in a model-based manner at the onset of learning, before converging to a model-free algorithm once sufficient data have been collected. We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines. CHI thus provides a principled framework for harnessing the sample efficiency of model-based planning while retaining the asymptotic performance of model-free policy optimisation.
Jul-11-2020
- Country:
- Europe > United Kingdom
- Scotland > City of Edinburgh > Edinburgh (0.04)
- Asia > Middle East
- Jordan (0.04)
- Genre:
- Research Report (0.82)
- Technology: