Strongly Solving $7 \times 6$ Connect-Four on Consumer Grade Hardware
While the game Connect-Four has been solved mathematically and the best move can be computed effectively with search-based methods, a strong solution in the form of a look-up table was believed to be infeasible. In this paper, we revisit a symbolic search method based on binary decision diagrams to produce strong solutions. With our efficient implementation, we were able to produce an 89.6 GB look-up table in 47 hours on a single CPU core with 128 GB of main memory for the standard $7 \times 6$ board. In addition to this win-draw-loss evaluation, our open-source artifact includes an alpha-beta search to find the move which achieves the fastest win or slowest loss.
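As a toy illustration of the fastest-win/slowest-loss tie-breaking used by such a move search, here is a negamax sketch on a simple subtraction game (take 1 or 2 stones; taking the last stone wins). This is an illustrative sketch, not the paper's Connect-Four code:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def solve(n):
    """Negamax with distance-aware tie-breaking.

    Returns (outcome, plies) from the view of the player to move:
    outcome is +1 (win) or -1 (loss), plies the game length under
    optimal play preferring the fastest win and the slowest loss."""
    if n == 0:
        return (-1, 0)  # previous player took the last stone
    best = None
    for take in (1, 2):
        if take > n:
            continue
        child_out, child_plies = solve(n - take)
        out, plies = -child_out, child_plies + 1
        if best is None:
            best = (out, plies)
        else:
            b_out, b_plies = best
            # prefer the better outcome; among wins the shortest,
            # among losses the longest
            if (out > b_out
                    or (out == b_out and out > 0 and plies < b_plies)
                    or (out == b_out and out <= 0 and plies > b_plies)):
                best = (out, plies)
    return best
```

In this game, positions with `n % 3 == 0` are losses for the player to move, and the returned ply counts confirm the tie-breaking behaves as intended.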
Flexible game-playing AI with AlphaViT: adapting to multiple games and board sizes
This paper presents novel game AI agents based on the AlphaZero framework, enhanced with Vision Transformers (ViT): AlphaViT, AlphaViD, and AlphaVDA. These agents are designed to play various board games of different sizes using a single model, overcoming AlphaZero's limitation of being restricted to a fixed board size. AlphaViT uses only a transformer encoder, while AlphaViD and AlphaVDA contain both an encoder and a decoder. AlphaViD's decoder receives input from the encoder output, while AlphaVDA uses a learnable matrix as decoder input. Using the AlphaZero framework, the three proposed methods demonstrate their versatility in different game environments, including Connect4, Gomoku, and Othello. Experimental results show that these agents, whether trained on a single game or on multiple games simultaneously, consistently outperform traditional algorithms such as Minimax and Monte Carlo tree search using a single DNN with shared weights, while approaching the performance of AlphaZero. In particular, AlphaViT and AlphaViD show strong performance across games, with AlphaViD benefiting from an additional decoder layer that enhances its ability to adapt to different action spaces and board sizes. These results may suggest the potential of transformer-based architectures to develop more flexible and robust game AI agents capable of excelling in multiple games and dynamic environments.
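The size-independence attributed here to transformer encoders can be sketched in a few lines of numpy: a single self-attention layer whose parameter shapes depend only on the embedding width, so one set of weights evaluates boards with any number of cell tokens. A minimal sketch only, not the AlphaViT architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # embedding width; the only dimension the weights depend on
Wq, Wk, Wv = (rng.normal(scale=0.1, size=(d, d)) for _ in range(3))
w_policy = rng.normal(size=d)   # per-token move logit, shared across cells
w_value = rng.normal(size=d)    # value head on the mean token

def encode(tokens):
    """One self-attention layer over (n_cells, d) board tokens."""
    Q, K, V = tokens @ Wq, tokens @ Wk, tokens @ Wv
    att = Q @ K.T / np.sqrt(d)
    att = np.exp(att - att.max(1, keepdims=True))  # stable softmax
    att /= att.sum(1, keepdims=True)
    return att @ V

def policy_value(tokens):
    h = encode(tokens)
    return h @ w_policy, float(h.mean(0) @ w_value)

# identical parameters evaluate a 9-cell and a 25-cell board
p_small, v_small = policy_value(rng.normal(size=(9, d)))
p_large, v_large = policy_value(rng.normal(size=(25, d)))
```

The policy output has one logit per board cell, so the head scales with the board automatically; a CNN with a fixed fully-connected head would not.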
Ordinal Potential-based Player Rating
It was recently observed that Elo ratings fail at preserving transitive relations among strategies and therefore cannot correctly extract the transitive component of a game. We provide a characterization of transitive games as a weak variant of ordinal potential games and show that Elo ratings actually do preserve transitivity when computed in the right space, using suitable invertible mappings. Leveraging this insight, we introduce a new game decomposition of an arbitrary game into transitive and cyclic components that is learnt using a neural network-based architecture and that prioritises capturing the sign pattern of the game, namely transitive and cyclic relations among strategies. We link our approach to the known concept of sign-rank, and evaluate our methodology using both toy examples and empirical data from real-world games.
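For reference, the standard Elo logistic model this work builds on (textbook formulas, not the paper's decomposition). In rating space, win probabilities are a monotone function of the rating difference, which is the transitivity property at issue:

```python
import math

def elo_expected(r_a, r_b):
    """Expected score of A against B under the Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def elo_update(r_a, r_b, score_a, k=32.0):
    """One-game Elo update; score_a is 1, 0.5, or 0 from A's view."""
    e_a = elo_expected(r_a, r_b)
    delta = k * (score_a - e_a)
    return r_a + delta, r_b - delta
```

Because the expected score depends only on `r_a - r_b` through an increasing map, whenever A outrates B and B outrates C, the model also predicts A beats C; cyclic relations among strategies are exactly what this model cannot express.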
From Images to Connections: Can DQN with GNNs learn the Strategic Game of Hex?
Keller, Yannik, Blüml, Jannis, Sudhakaran, Gopika, Kersting, Kristian
The gameplay of strategic board games such as chess, Go, and Hex is often characterized by combinatorial, relational structures--capturing distinct interactions and non-local patterns--and not just images. A key feature of CNNs is their relational inductive bias towards locality and translational invariance. Hence, we investigate the crucial question: can GNNs, with their ability to encode complex connections, replace CNNs in self-play reinforcement learning? To this end, we conduct a comparison using Hex--an abstract yet strategically rich board game--as our experimental platform. Our findings reveal that GNNs excel at dealing with long-range dependencies in game states and are less prone to overfitting, but also show reduced proficiency in discerning local patterns. This suggests a potential paradigm shift, signaling the use of game-specific structures to reshape self-play reinforcement learning. In 2016, AlphaGo (Silver et al., 2016) became the first AI to beat professional Go players by combining Monte-Carlo tree search (MCTS) with neural-network policy and value approximation and self-play training. Its approach has since been transferred to various other board games: AlphaZero (Silver et al., 2017) achieved superhuman strength in Chess and Shogi, while Mohex-3HNN (Gao et al., 2018) has become one of the strongest AIs for Hex. Behind these accomplishments lies a crucial observation: despite the diversity of the board games, all these programs use convolutional neural networks (CNNs) as the foundational architecture to approximate policy and value functions. These biases align harmoniously for most positions in the game of Go, where spatially local patterns dominate. [Figure 1: Non-local dependencies exist in many prominent board games. If A is blue, playing at B is a wasted move.]
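The board-as-graph view can be sketched directly: build the adjacency of a Hex board (six neighbours per interior cell on the standard rhombic layout) and apply one mean-aggregation message-passing step. A minimal sketch under assumed connectivity, not the paper's network:

```python
import numpy as np

def hex_adjacency(n):
    """Dense adjacency matrix of an n x n Hex board (6-neighbour cells)."""
    idx = lambda r, c: r * n + c
    A = np.zeros((n * n, n * n))
    nbrs = [(-1, 0), (1, 0), (0, -1), (0, 1), (-1, 1), (1, -1)]
    for r in range(n):
        for c in range(n):
            for dr, dc in nbrs:
                rr, cc = r + dr, c + dc
                if 0 <= rr < n and 0 <= cc < n:
                    A[idx(r, c), idx(rr, cc)] = 1.0
    return A

def gnn_layer(H, A, W):
    """One message-passing step: ReLU(mean over neighbours of H, times W).
    W's shape depends only on the feature widths, never on board size."""
    deg = A.sum(1, keepdims=True)
    msg = (A @ H) / np.maximum(deg, 1.0)
    return np.maximum(msg @ W, 0.0)
```

Stacking such layers lets information flow along connected stone chains, which is one intuition for why GNNs handle the long-range dependencies highlighted above.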
Computing Motion Plans for Assembling Particles with Global Control
Blumenberg, Patrick, Schmidt, Arne, Becker, Aaron T.
We investigate motion planning algorithms for the assembly of shapes in the \emph{tilt model} in which unit-square tiles move in a grid world under the influence of uniform external forces and self-assemble according to certain rules. We provide several heuristics and experimental evaluation of their success rate, solution length, runtime, and memory consumption.
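The core primitive of the tilt model is a global move: every tile slides maximally in one direction until blocked by a wall or another tile. A simplified sketch of that step (`#` walls and `T` tiles are assumed notation here, and the paper's self-assembly rules are omitted):

```python
def tilt(grid, direction):
    """Apply one global tilt ('N', 'S', 'W', or 'E') to a grid given as
    a list of strings; every 'T' slides until it hits a '#' or a tile."""
    g = [list(row) for row in grid]
    rows, cols = len(g), len(g[0])

    def lines():
        # enumerate each row/column in sliding order (front of the slide first)
        if direction in ("W", "E"):
            for r in range(rows):
                cs = range(cols) if direction == "W" else range(cols - 1, -1, -1)
                yield [(r, c) for c in cs]
        else:
            for c in range(cols):
                rs = range(rows) if direction == "N" else range(rows - 1, -1, -1)
                yield [(r, c) for r in rs]

    for line in lines():
        free = 0  # first unoccupied slot after the last wall or placed tile
        for i, (r, c) in enumerate(line):
            if g[r][c] == "#":
                free = i + 1
            elif g[r][c] == "T":
                g[r][c] = "."
                fr, fc = line[free]
                g[fr][fc] = "T"
                free += 1
    return ["".join(row) for row in g]
```

A motion plan in this model is then a sequence of such uniform tilts, and the planning algorithms search over these sequences.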
Reinforcement Learning for ConnectX
ConnectX is a two-player game that generalizes the popular game Connect 4. The objective is to align X coins along a row, column, or diagonal of an M x N board; the first player to do so wins the game. The parameters (M, N, X) are allowed to change in each game, making ConnectX a novel and challenging problem. In this paper, we present our work on the implementation and modification of various reinforcement learning algorithms to play ConnectX.
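The variable win condition is the only game-specific rule; a straightforward terminal check parameterized by (M, N, X) looks like this (an illustrative sketch, not the paper's implementation):

```python
def has_x_in_a_row(board, player, x):
    """True if `player` has x aligned pieces on an M x N board
    (list of lists, rows by columns) along a row, column, or diagonal."""
    m, n = len(board), len(board[0])
    for r in range(m):
        for c in range(n):
            # four directions cover all lines: right, down, both diagonals
            for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
                cells = [(r + i * dr, c + i * dc) for i in range(x)]
                if all(0 <= rr < m and 0 <= cc < n and board[rr][cc] == player
                       for rr, cc in cells):
                    return True
    return False
```

Because X is a parameter, an agent cannot hard-code the usual Connect-4 threat patterns; this check is what the environment's reward signal would be built on.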
Train on Small, Play the Large: Scaling Up Board Games with AlphaZero and GNN
Ben-Assayag, Shai, El-Yaniv, Ran
Playing board games is considered a major challenge for both humans and AI researchers. Because some complicated board games are quite hard to learn, humans usually begin by playing on smaller boards and incrementally advance to master larger-board strategies. Most neural network frameworks that are currently tasked with playing board games neither perform such incremental learning nor possess capabilities to automatically scale up. In this work, we look at the board as a graph and embed a graph neural network architecture within the AlphaZero framework, along with some other innovative improvements. Our ScalableAlphaZero is capable of learning to play incrementally on small boards and advancing to play on large ones. Our model can be trained quickly to play different challenging board games on multiple board sizes, without using any domain knowledge. We demonstrate the effectiveness of ScalableAlphaZero and show, for example, that by training it for only three days on small Othello boards, it can defeat the AlphaZero model on a large board, which was trained to play the large board for $30$ days.
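The scale-up claim rests on GNN parameters being independent of board size: the same message weights and per-node policy head evaluate any board once it is viewed as a graph. A numpy sketch with hypothetical shapes, not ScalableAlphaZero itself:

```python
import numpy as np

def grid_graph(n):
    """4-neighbour adjacency for an n x n board viewed as a graph."""
    A = np.zeros((n * n, n * n))
    for r in range(n):
        for c in range(n):
            for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if 0 <= rr < n and 0 <= cc < n:
                    A[r * n + c, rr * n + cc] = 1.0
    return A

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 16))   # message weights: feature widths only
w_pol = rng.normal(size=16)    # per-node policy head, size-agnostic

def policy_logits(board_feats, A):
    """One message pass then a shared linear head: one logit per cell."""
    H = np.maximum((A @ board_feats) @ W, 0.0)
    return H @ w_pol

# identical parameters evaluate a 4x4 and an 8x8 board
small = policy_logits(rng.normal(size=(16, 3)), grid_graph(4))
large = policy_logits(rng.normal(size=(64, 3)), grid_graph(8))
```

This is exactly what a fixed-input CNN-plus-dense head cannot do without retraining, and it is why training on small boards can transfer to large ones.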
Transfer of Fully Convolutional Policy-Value Networks Between Games and Game Variants
Soemers, Dennis J. N. J., Mella, Vegard, Piette, Eric, Stephenson, Matthew, Browne, Cameron, Teytaud, Olivier
In this paper, we use fully convolutional architectures in AlphaZero-like self-play training setups to facilitate transfer between variants of board games as well as distinct games. We explore how to transfer trained parameters of these architectures based on shared semantics of channels in the state and action representations of the Ludii general game system. We use Ludii's large library of games and game variants for extensive transfer learning evaluations, in zero-shot transfer experiments as well as experiments with additional fine-tuning time.
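The channel-matching idea can be sketched as copying first-layer convolution filters between input channels that share a semantic label, and freshly initialising the rest. The labels below (e.g. 'p1_piece') are hypothetical illustrations, not Ludii's actual channel names:

```python
import numpy as np

def transfer_filters(src_w, src_labels, tgt_labels, rng=None):
    """Build target first-layer conv weights (out, len(tgt_labels), h, w)
    from source weights (out, len(src_labels), h, w): channels whose
    semantic label matches are copied; unmatched ones are re-initialised."""
    rng = rng or np.random.default_rng(0)
    out, _, h, w = src_w.shape
    tgt_w = rng.normal(scale=0.01, size=(out, len(tgt_labels), h, w))
    src_idx = {lab: i for i, lab in enumerate(src_labels)}
    for j, lab in enumerate(tgt_labels):
        if lab in src_idx:
            tgt_w[:, j] = src_w[:, src_idx[lab]]
    return tgt_w
```

Fully convolutional bodies make the rest of the network shape-compatible across games, so only the channel mapping at the boundaries needs this treatment.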
Navigating the Landscape of Multiplayer Games to Probe the Drosophila of AI
Omidshafiei, Shayegan, Tuyls, Karl, Czarnecki, Wojciech M., Santos, Francisco C., Rowland, Mark, Connor, Jerome, Hennes, Daniel, Muller, Paul, Perolat, Julien, De Vylder, Bart, Gruslys, Audrunas, Munos, Remi
Multiplayer games have a long history of being used as key testbeds for evaluation and training in artificial intelligence (AI), aptly referred to as the "Drosophila of AI". Traditionally, researchers have focused on using games to build strong AI agents that, e.g., achieve human-level performance. This progress, however, also requires a classification of how 'interesting' a game is for an artificial agent, which in turn requires characterization of games and their topological landscape. Tackling this latter question not only facilitates an understanding of the characteristics of learnt AI agents in games, but can also help determine what game an AI should address next as part of its training. Here, we show how network measures applied to so-called response graphs of large-scale games enable the creation of a useful landscape of games, quantifying the relationships between games of widely varying sizes, characteristics, and complexities. We illustrate our findings in various domains, ranging from well-studied canonical games to significantly more complex empirical games capturing the performance of trained AI agents pitted against one another. Our results culminate in a demonstration of how one can leverage this information to automatically generate new and interesting games, including mixtures of empirical games synthesized from real world games.
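In its simplest head-to-head form, a response graph can be computed directly from a payoff matrix: draw an edge from strategy i to strategy j when j beats i, and network measures on that graph expose the game's structure. A toy version for symmetric zero-sum games, not the paper's large-scale construction:

```python
import numpy as np

def response_graph(payoff):
    """Directed response graph of a symmetric zero-sum game given as a
    payoff matrix (payoff[a][b] = row player's payoff for a vs b):
    edge i -> j whenever j beats i head-to-head."""
    n = len(payoff)
    return np.array([[1.0 if i != j and payoff[j][i] > 0 else 0.0
                      for j in range(n)] for i in range(n)])
```

On rock-paper-scissors the graph is a single directed 3-cycle, and a simple measure such as the number of closed 3-walks (the trace of the cubed adjacency matrix) already detects that cyclic structure.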
Accelerating Self-Play Learning in Go
In 2017, DeepMind's AlphaGoZero demonstrated in a landmark result that it was possible to achieve superhuman performance in the game of Go starting from random play and learning only via reinforcement learning of a neural network using self-play bootstrapping from Monte-Carlo tree search[9]. Moreover, AlphaGoZero used only fairly minimal game-specific tuning. Subsequently, DeepMind's AlphaZero demonstrated that the same methods could also be used to train extremely strong agents in Chess and Shogi. However, the amount of computation required was large, with DeepMind's main reported run taking about 41 TPU-years in total, parallelized over 5000 TPUs [8]. The significant cost of reproducing this work has slowed research, putting it out of reach for all but major companies such as Facebook[11], as well as a few online massively distributed computation projects, notably Leela Zero for Go[14] and Leela Chess Zero for Chess[17]. In this paper, we introduce several new techniques, while also reviving some ideas from pre-AlphaZero research in computer Go and newly applying them to the AlphaZero process. Combined with minor domain-specific heuristic optimizations and overall tuning, these ideas greatly improve the efficiency of self-play learning. Still starting only from random play, and training for merely a week on about 30 GPUs, our bot KataGo reaches a strength just below that of Leela Zero as of its 15-block neural net "LZ130", a likely professional or possibly just-superhuman level when run on strong consumer hardware.
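At the core of all of these self-play systems is PUCT child selection inside MCTS, which balances the network's value estimates against its policy priors. A minimal sketch of the standard AlphaZero-style formula, not KataGo's tuned variant:

```python
import math

def puct_select(stats, c_puct=1.5):
    """AlphaZero-style PUCT: pick the child maximising Q + U, where
    U = c_puct * prior * sqrt(N_parent) / (1 + N_child).
    stats maps move -> (visits, total_value, prior)."""
    n_parent = sum(n for n, _, _ in stats.values())

    def score(item):
        _, (n, w, p) = item
        q = w / n if n else 0.0                          # mean value so far
        u = c_puct * p * math.sqrt(n_parent) / (1 + n)   # exploration bonus
        return q + u

    return max(stats.items(), key=score)[0]
```

The exploration term shrinks as a child accumulates visits, so search effort shifts from the prior's suggestions toward moves whose observed value holds up.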