polygame


Learning to Play Stochastic Two-player Perfect-Information Games without Knowledge

Cohen-Solal, Quentin, Cazenave, Tristan

arXiv.org Artificial Intelligence

In this paper, we extend the Descent framework, which enables learning and planning in the context of two-player games with perfect information, to the framework of stochastic games. We propose two ways of doing this: the first generalizes the search algorithm, i.e. Descent, to stochastic games, and the second approximates stochastic games by deterministic games. We then evaluate them on the game EinStein würfelt nicht! against state-of-the-art algorithms: Expectiminimax and Polygames (i.e. the AlphaZero algorithm). Our generalization of Descent obtains the best results. The approximation by deterministic games nevertheless obtains good results, suggesting that it could give better results in particular contexts.
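The Expectiminimax baseline mentioned above extends minimax to stochastic games by adding chance nodes, whose value is the probability-weighted average of their children. A minimal sketch in plain Python (the `Node` class is a hypothetical illustration, not code from the paper):

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    kind: str                  # "max", "min", "chance", or "leaf"
    value: float = 0.0         # payoff, used only at leaves
    children: list = field(default_factory=list)
    probs: list = field(default_factory=list)  # outcome probabilities at chance nodes

def expectiminimax(node):
    if node.kind == "leaf":
        return node.value
    if node.kind == "chance":
        # A chance node (e.g. a die roll) is worth the expected value of its outcomes.
        return sum(p * expectiminimax(c) for p, c in zip(node.probs, node.children))
    vals = [expectiminimax(c) for c in node.children]
    return max(vals) if node.kind == "max" else min(vals)

# The maximizing player chooses between two moves, each followed by a die roll.
roll_a = Node("chance", probs=[0.5, 0.5],
              children=[Node("leaf", 2.0), Node("leaf", 0.0)])   # expected 1.0
roll_b = Node("chance", probs=[0.9, 0.1],
              children=[Node("leaf", 2.0), Node("leaf", -10.0)])  # expected 0.8
root = Node("max", children=[roll_a, roll_b])
```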


Transfer of Fully Convolutional Policy-Value Networks Between Games and Game Variants

Soemers, Dennis J. N. J., Mella, Vegard, Piette, Eric, Stephenson, Matthew, Browne, Cameron, Teytaud, Olivier

arXiv.org Artificial Intelligence

In this paper, we use fully convolutional architectures in AlphaZero-like self-play training setups to facilitate transfer between variants of board games as well as distinct games. We explore how to transfer trained parameters of these architectures based on shared semantics of channels in the state and action representations of the Ludii general game system. We use Ludii's large library of games and game variants for extensive transfer learning evaluations, in zero-shot transfer experiments as well as experiments with additional fine-tuning time.


Deep Learning for General Game Playing with Ludii and Polygames

Soemers, Dennis J. N. J., Mella, Vegard, Browne, Cameron, Teytaud, Olivier

arXiv.org Artificial Intelligence

Combinations of Monte-Carlo tree search and Deep Neural Networks, trained through self-play, have produced state-of-the-art results for automated game-playing in many board games. The training and search algorithms are not game-specific, but every individual game that these approaches are applied to still requires domain knowledge for the implementation of the game's rules, and constructing the neural network's architecture -- in particular the shapes of its input and output tensors. Ludii is a general game system that already contains over 500 different games, which can rapidly grow thanks to its powerful and user-friendly game description language. Polygames is a framework with training and search algorithms, which has already produced superhuman players for several board games. This paper describes the implementation of a bridge between Ludii and Polygames, which enables Polygames to train and evaluate models for games that are implemented and run through Ludii. We do not require any game-specific domain knowledge anymore, and instead leverage our domain knowledge of the Ludii system and its abstract state and move representations to write functions that can automatically determine the appropriate shapes for input and output tensors for any game implemented in Ludii. We describe experimental results for short training runs in a wide variety of different board games, and discuss several open problems and avenues for future research.
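The core idea of the Ludii-Polygames bridge is to derive tensor shapes mechanically from a game's abstract description rather than hand-crafting them per game. A hedged sketch of that mapping, where the channel counts and a planar board are hypothetical simplifications (not Ludii's actual state and move representations):

```python
def tensor_shapes(num_state_channels, num_move_channels, rows, cols):
    """Derive network I/O shapes from abstract game properties.

    Input: one spatial plane per state channel (piece types, player to move, ...).
    Output: one logit plane per move channel, so each (channel, cell) pair
    indexes a distinct move.
    """
    input_shape = (num_state_channels, rows, cols)
    output_shape = (num_move_channels, rows, cols)
    return input_shape, output_shape

# A hypothetical game with 10 state channels and 3 move channels on an 8x8 board:
shapes = tensor_shapes(10, 3, 8, 8)
```

Because the shapes are computed from properties every Ludii game exposes, the same function covers any game in the library without per-game engineering.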


Minimax Strikes Back

Cohen-Solal, Quentin, Cazenave, Tristan

arXiv.org Artificial Intelligence

Deep Reinforcement Learning (DRL) reaches a superhuman level of play in many complete-information games. The state-of-the-art search algorithm used in combination with DRL is Monte Carlo Tree Search (MCTS). We take another approach to DRL, using a Minimax algorithm instead of MCTS and learning only the evaluation of states, not the policy. We show that for multiple games it is competitive with state-of-the-art DRL both in learning performance and in head-to-head play.
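The approach described above replaces MCTS with depth-limited minimax that falls back on a learned state evaluation at the frontier. A minimal alpha-beta sketch with a pluggable `evaluate` function standing in for the learned network (the game interface `moves`/`apply` is a hypothetical illustration):

```python
def alphabeta(state, depth, alpha, beta, maximizing, evaluate, moves, apply_move):
    """Depth-limited alpha-beta search backed by a learned evaluation.

    `evaluate` plays the role of the trained value network: it scores a state
    from the maximizing player's point of view when the depth limit is reached.
    """
    legal = moves(state)
    if depth == 0 or not legal:
        return evaluate(state)  # learned evaluation at the frontier
    if maximizing:
        best = float("-inf")
        for m in legal:
            best = max(best, alphabeta(apply_move(state, m), depth - 1,
                                       alpha, beta, False, evaluate, moves, apply_move))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # beta cutoff: opponent avoids this branch
        return best
    best = float("inf")
    for m in legal:
        best = min(best, alphabeta(apply_move(state, m), depth - 1,
                                   alpha, beta, True, evaluate, moves, apply_move))
        beta = min(beta, best)
        if alpha >= beta:
            break  # alpha cutoff
    return best

# Toy game: the state is an integer, a move adds 1 or 2 to it, and the
# "learned" evaluation is simply the state itself.
result = alphabeta(0, 2, float("-inf"), float("inf"), True,
                   evaluate=lambda s: s,
                   moves=lambda s: [1, 2],
                   apply_move=lambda s, m: s + m)
```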


Polygames: Improved Zero Learning

Cazenave, Tristan, Chen, Yen-Chi, Chen, Guan-Wei, Chen, Shi-Yu, Chiu, Xian-Dong, Dehos, Julien, Elsa, Maria, Gong, Qucheng, Hu, Hengyuan, Khalidov, Vasil, Li, Cheng-Ling, Lin, Hsin-I, Lin, Yu-Jin, Martinet, Xavier, Mella, Vegard, Rapin, Jeremy, Roziere, Baptiste, Synnaeve, Gabriel, Teytaud, Fabien, Teytaud, Olivier, Ye, Shi-Cheng, Ye, Yi-Jun, Yen, Shi-Jim, Zagoruyko, Sergey

arXiv.org Machine Learning

Since DeepMind's AlphaZero, Zero learning has quickly become the state-of-the-art method for many board games. It can be improved using a fully convolutional structure (no fully connected layer). Using such an architecture plus global pooling, we can create bots independent of the board size. The training can be made more robust by keeping track of the best checkpoints during training and by training against them. Using these features, we release Polygames, our framework for Zero learning, with its library of games and its checkpoints. We won against strong humans at the game of Hex on 19x19 boards, which was often said to be intractable for zero learning, and at Havannah. We also won several first places at the TAAI competitions.
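The board-size independence claimed above comes from global pooling: averaging (or taking the max) over the spatial dimensions collapses any board into a fixed-length vector with one entry per channel. A pure-Python illustration of the principle, not the Polygames implementation:

```python
def global_pool(feature_map):
    """Collapse a (channels x rows x cols) feature map to per-channel statistics.

    `feature_map` is a list of channels, each a 2-D list of floats. The result
    has one (average, maximum) pair per channel, regardless of board size.
    """
    pooled = []
    for channel in feature_map:
        cells = [v for row in channel for v in row]
        pooled.append((sum(cells) / len(cells), max(cells)))
    return pooled

# Two boards of different sizes with the same number of channels produce
# feature vectors of the same length, so the head layers need no resizing.
small = [[[1.0] * 3 for _ in range(3)] for _ in range(4)]   # 4 channels, 3x3
large = [[[1.0] * 5 for _ in range(5)] for _ in range(4)]   # 4 channels, 5x5
```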