Value-of-Information based Arbitration between Model-based and Model-free Control
Bera, Krishn; Mandilwar, Yash; Raju, Bapi
arXiv.org Artificial Intelligence
There have been numerous attempts to explain general learning behaviours using model-based and model-free methods. Model-based control is flexible but computationally expensive in planning, whereas model-free control is quick but inflexible. Because of this flexibility, model-based control adjusts its behaviour after reward devaluation and contingency degradation. Multiple arbitration schemes have been suggested to combine the data efficiency of model-based control with the computational efficiency of model-free control. In this context, we propose a quantitative 'value of information' based arbitration between the two controllers in order to establish a general computational framework for skill learning. The interacting model-based and model-free reinforcement learning processes are arbitrated using an uncertainty-based value of information. We further show that our algorithm performs better than Q-learning as well as Q-learning with experience replay.
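The abstract does not say how the value of information is computed, so the following is only a minimal sketch of what an uncertainty-based arbitration rule could look like, not the authors' algorithm. Everything in it is an assumption made for illustration: the function names, the tabular representation, the fixed `planning_cost`, and the VOI proxy itself (how far the confidence intervals of the two best model-free action values overlap). The idea is that planning with the model is worth its cost only when the model-free estimates are too uncertain to separate the best action from the runner-up.

```python
# Hypothetical sketch: uncertainty-based arbitration between a model-free
# and a model-based controller. Not taken from the paper.

def value_of_information(q_values, q_uncertainty):
    """Assumed VOI proxy: overlap between the confidence intervals of the
    best and second-best model-free action values (needs >= 2 actions)."""
    order = sorted(range(len(q_values)), key=lambda a: q_values[a], reverse=True)
    best, runner_up = order[0], order[1]
    upper_runner_up = q_values[runner_up] + q_uncertainty[runner_up]
    lower_best = q_values[best] - q_uncertainty[best]
    # No overlap means more information cannot change the greedy choice.
    return max(0.0, upper_runner_up - lower_best)


def model_based_values(state, transitions, rewards, q_table, gamma):
    """One-step lookahead through a learned model.
    transitions[state][action] is a dict {next_state: probability}."""
    values = []
    for action, next_dist in enumerate(transitions[state]):
        expected_next = sum(p * max(q_table[s2]) for s2, p in next_dist.items())
        values.append(rewards[state][action] + gamma * expected_next)
    return values


def arbitrate(state, q_table, uncertainty, transitions, rewards,
              planning_cost=0.1, gamma=0.9):
    """Pick the controller for this state, then act greedily under it."""
    voi = value_of_information(q_table[state], uncertainty[state])
    if voi > planning_cost:
        # Planning is expected to pay for itself: consult the model.
        values = model_based_values(state, transitions, rewards, q_table, gamma)
        controller = "model-based"
    else:
        # Cached values are reliable enough: act on them directly.
        values = q_table[state]
        controller = "model-free"
    action = max(range(len(values)), key=lambda a: values[a])
    return action, controller


# Toy usage with made-up numbers: state 0 has two actions leading to
# states 1 and 2; the cached values for state 0 are close and uncertain.
q_table = {0: [1.0, 0.9], 1: [0.0, 0.0], 2: [5.0, 0.0]}
uncertainty = {0: [0.5, 0.5]}
transitions = {0: [{1: 1.0}, {2: 1.0}]}
rewards = {0: [0.0, 0.0]}
print(arbitrate(0, q_table, uncertainty, transitions, rewards))
```

In this sketch the planning cost is a constant; a fuller treatment would presumably tie it to the actual compute spent on planning and would update the uncertainty estimates from experience.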
Dec-8-2019