Monte Carlo Q-learning for General Game Playing
Wang, Hui, Emmerich, Michael, Plaat, Aske
arXiv.org Artificial Intelligence
Following the recent groundbreaking results of AlphaGo, interest in reinforcement learning for game playing has grown strongly. General Game Playing (GGP) provides a good testbed for reinforcement learning: in GGP, a specification of the game rules is given, and GGP problems can be solved by reinforcement learning. Q-learning is one of the canonical reinforcement learning methods and has been used in GGP by Banerjee & Stone (IJCAI 2007). In this paper we implement Q-learning in GGP for three small-board games (Tic-Tac-Toe, Connect Four, Hex) to allow comparison with Banerjee & Stone. As expected, Q-learning converges, although much more slowly than Monte Carlo Tree Search (MCTS). Borrowing an idea from MCTS, we enhance Q-learning with Monte Carlo Search to give QM-learning. This enhancement improves on the performance of pure Q-learning. We believe that QM-learning can also be used to further improve the performance of reinforcement learning for larger games, which we will test in future work.
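The abstract does not spell out how Monte Carlo Search is combined with Q-learning, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes tabular Q-learning on Tic-Tac-Toe against a random opponent, and it assumes that when no Q-value has been recorded yet for any move in a state, the learner falls back to random playouts to pick a move. All names (`train`, `choose_move`, `monte_carlo_move`, `rollout`) and all parameter values are hypothetical.

```python
import random

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if that player has completed a line, else None."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None

def legal_moves(board):
    return [i for i, cell in enumerate(board) if cell == ' ']

def play(board, move, player):
    return board[:move] + player + board[move + 1:]

def other(player):
    return 'O' if player == 'X' else 'X'

def rollout(board, to_move, me, rng):
    """Play random moves to the end; +1 if `me` wins, -1 if `me` loses, 0 for a draw."""
    while winner(board) is None and legal_moves(board):
        board = play(board, rng.choice(legal_moves(board)), to_move)
        to_move = other(to_move)
    w = winner(board)
    return 0.0 if w is None else (1.0 if w == me else -1.0)

def monte_carlo_move(board, player, rng, n_rollouts=20):
    """Pick the move whose random playouts score best on average."""
    def score(move):
        nxt = play(board, move, player)
        total = sum(rollout(nxt, other(player), player, rng)
                    for _ in range(n_rollouts))
        return total / n_rollouts
    return max(legal_moves(board), key=score)

def choose_move(Q, board, player, epsilon, rng):
    moves = legal_moves(board)
    if rng.random() < epsilon:                    # epsilon-greedy exploration
        return rng.choice(moves)
    if all((board, m) not in Q for m in moves):   # assumption: nothing learned here yet
        return monte_carlo_move(board, player, rng)
    return max(moves, key=lambda m: Q.get((board, m), 0.0))

def train(episodes=200, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn Q-values for 'X' against a uniformly random 'O' (assumed setup)."""
    rng = random.Random(seed)
    Q = {}
    for _ in range(episodes):
        board = ' ' * 9
        while True:
            move = choose_move(Q, board, 'X', epsilon, rng)
            nxt = play(board, move, 'X')
            if winner(nxt) is None and legal_moves(nxt):
                nxt = play(nxt, rng.choice(legal_moves(nxt)), 'O')
            w = winner(nxt)
            done = w is not None or not legal_moves(nxt)
            reward = 0.0 if w is None else (1.0 if w == 'X' else -1.0)
            future = 0.0 if done else max(Q.get((nxt, m), 0.0)
                                          for m in legal_moves(nxt))
            old = Q.get((board, move), 0.0)
            # Standard Q-learning update toward reward + discounted best next value.
            Q[(board, move)] = old + alpha * (reward + gamma * future - old)
            if done:
                break
            board = nxt
    return Q
```

Calling `train()` returns a dictionary keyed by `(board, move)` pairs. In this sketch the playout fallback only affects move selection in states with no recorded values; the update rule itself is plain Q-learning.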
May-21-2018