chess and shogi
No rules, no problem: DeepMind's MuZero masters games while learning how to play them – TechCrunch
DeepMind has made it a mission to show that not only can an AI truly become proficient at a game, it can do so without even being told the rules. Its newest AI agent, called MuZero, accomplishes this not just in visually simple games with complex strategies, like Go, chess and shogi, but also in visually complex Atari games. The success of DeepMind's earlier AIs was at least partly due to very efficient navigation of the immense decision trees that represent the possible actions in a game. In Go or chess these trees are governed by very specific rules, like where pieces can move, what happens when this piece does that, and so on. The AI that beat world champions at Go, AlphaGo, knew these rules and kept them in mind (or perhaps in RAM) while studying games between and against human players, forming a set of best practices and strategies.
r/MachineLearning - [R] [1911.08265] Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model
Much of it is the same as Value Prediction Networks, which proposes that instead of training a model to minimize L2 prediction loss, you just train it to get the long-term reward/value right for a start state and a series of actions. That gets around a lot of the difficulty of using MBRL for Atari-like things, where it's very hard to accurately predict next pixels. They pretty much simulate a dense tree to some short depth, assign estimated values to the nodes, and use that for action selection. One problem is that you're probably simulating a lot of states that your value function would tell you are DEFINITELY not worthwhile. Atari has 16 actions -- it's infeasible to simulate more than 3 states deep. And since you're simulating in all directions, but only taking the best (ε-greedy) action, you're not going to gather training data on most of the transitions you're estimating.
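To make the "dense tree to some short depth" point concrete, here is a minimal sketch in Python. It is not code from the Value Prediction Networks or MuZero papers. The names `dense_tree_nodes`, `plan`, `toy_model` and `toy_value` are hypothetical, and the toy model and value function stand in for learned networks. It counts how many states a full tree simulates, using the comment's figure of 16 actions, and then picks an action by exhaustive lookahead with value estimates at the leaves.

```python
from typing import Callable, Sequence, Tuple


def dense_tree_nodes(num_actions: int, depth: int) -> int:
    """Count the states simulated below the root of a full tree of this depth."""
    return sum(num_actions ** d for d in range(1, depth + 1))


def plan(
    state: int,
    actions: Sequence[int],
    model: Callable[[int, int], Tuple[int, float]],
    value: Callable[[int], float],
    depth: int,
    discount: float = 0.99,
) -> Tuple[int, float]:
    """Exhaustive depth-limited lookahead: expand every action at every level,
    bootstrap from the value estimate at the leaves, and back up the maximum."""

    def q(s: int, a: int, d: int) -> float:
        next_s, reward = model(s, a)  # one simulated transition
        if d == 1:
            return reward + discount * value(next_s)
        return reward + discount * max(q(next_s, b, d - 1) for b in actions)

    best_action = max(actions, key=lambda a: q(state, a, depth))
    return best_action, q(state, best_action, depth)


# Hypothetical stand-ins for a learned model and value function:
# the state is an integer position, and positions near 3 are valued highest.
def toy_model(s: int, a: int) -> Tuple[int, float]:
    return s + a, 0.0


def toy_value(s: int) -> float:
    return -float(abs(s - 3))


if __name__ == "__main__":
    for d in (1, 2, 3, 4):
        print(d, dense_tree_nodes(16, d))
    print(plan(0, [-1, 1], toy_model, toy_value, depth=2))
```

With 16 actions, a depth-3 tree already simulates 4,368 states per decision and a depth-4 tree 69,904, which is the cost the comment is pointing at. Only one of those branches is actually taken in the environment, so most simulated transitions never get checked against real data.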
A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play
Computers can beat humans at increasingly complex games, including chess and Go. However, these programs are typically constructed for a particular game, exploiting its properties, such as the symmetries of the board on which it is played. Silver et al. developed a program called AlphaZero, which taught itself to play Go, chess, and shogi (a Japanese version of chess) (see the Editorial, and the Perspective by Campbell). AlphaZero managed to beat state-of-the-art programs specializing in these three games. The ability of AlphaZero to adapt to various game rules is a notable step toward achieving a general game-playing system.
tensorflow/minigo
This is a pure Python implementation of a neural-network based Go AI, using TensorFlow. While inspired by DeepMind's AlphaGo algorithm, this project is not a DeepMind project nor is it affiliated with the official AlphaGo project. Repeat, this is not the official AlphaGo program by DeepMind. This is an independent effort by Go enthusiasts to replicate the results of the AlphaGo Zero paper ("Mastering the Game of Go without Human Knowledge," Nature), with some resources generously made available by Google. Minigo is based off of Brian Lee's "MuGo" -- a pure Python implementation of the first AlphaGo paper "Mastering the Game of Go with Deep Neural Networks and Tree Search" published in Nature.
AlphaZero Annihilates World's Best Chess Bot After Just Four Hours of Practicing
A few months after demonstrating its dominance over the game of Go, DeepMind's AlphaZero AI has trounced the world's top-ranked chess engine--and it did so without any prior knowledge of the game and after just four hours of self-training. AlphaZero is now the most dominant chess-playing entity on the planet. In a one-on-one tournament against Stockfish 8, the reigning computer chess champion, the DeepMind-built system didn't lose a single game, winning or drawing all 100 games played. AlphaZero is a modified version of AlphaGo Zero, the AI that recently won all 100 games of Go against its predecessor, AlphaGo. In addition to mastering chess, AlphaZero also developed a proficiency for shogi, a similar Japanese board game.
DeepMind's Groundbreaking AlphaGo Zero AI Is Now a Versatile Gamer
Because chances are it can learn to outsmart you inside a day. Earlier this year, we reported that Alphabet's machine-learning subsidiary, DeepMind, had made a huge advance. Using an artificial-intelligence approach known as reinforcement learning, it had enabled its AlphaGo software to develop superhuman skills for the game of Go without needing human data. Armed with just the rules of the game, the AI was able to make random plays until it developed champion-beating strategies. The new software was dubbed AlphaGo Zero because it didn't need any human input. Now, in a paper published on arXiv, the DeepMind team reports that the software has been generalized so that it can learn other games.