DeepMind shook the worlds of reinforcement learning and Go with its creation AlphaGo, and later AlphaGo Zero. AlphaGo was the first computer program to beat a human professional Go player without handicap on a 19 x 19 board. It went on to beat the world champion Lee Sedol 4 games to 1 and, in later versions, Ke Jie (the world's number-one ranked player at the time) and many other top-ranked players. Go is a difficult environment because of its very large branching factor at every move, which makes classical techniques such as alpha-beta pruning and heuristic search impractical. I will present my work on reproducing the paper as closely as I could.
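To give a feel for why the branching factor matters, here is a rough back-of-the-envelope sketch. It estimates the size of a full game tree as b^d, where b is the branching factor and d the game length. The figures used (about 35 and 80 for chess, about 250 and 150 for Go) are commonly quoted approximations and are assumptions here, not values taken from this text.

```python
import math


def log10_game_tree_size(branching: int, depth: int) -> float:
    """Return log10(branching ** depth), i.e. the number of decimal digits
    (roughly) in the count of move sequences of the given length."""
    return depth * math.log10(branching)


# Assumed, commonly quoted rough figures (not from the text above).
chess_digits = log10_game_tree_size(branching=35, depth=80)
go_digits = log10_game_tree_size(branching=250, depth=150)

print(f"chess: about 10^{chess_digits:.0f} move sequences")
print(f"Go:    about 10^{go_digits:.0f} move sequences")
```

Even with good pruning, an exhaustive search only reduces the effective exponent; it does not close a gap of this size, which is why Go called for a different approach.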
A new signature table technique is described together with an improved book learning procedure which is thought to be much superior to the linear polynomial method described earlier. Full use is made of the so-called "alpha-beta" pruning and several forms of forward pruning to restrict the spread of the move tree and to permit the program to look ahead to a much greater depth than it otherwise could do. While still unable to outplay checker masters, the program's playing ability has been greatly improved.
See also: IEEE Xplore; Annual Review in Automatic Programming, Volume 6, Part 1, 1969, Pages 1–36; Some Studies in Machine Learning Using the Game of Checkers, IBM Journal of Research and Development 11, No. 6, 1967, 601
In this article we review standard null-move pruning and introduce our extended version of it, which we call verified null-move pruning. In verified null-move pruning, whenever the shallow null-move search indicates a fail-high, instead of cutting off the search from the current node, the search is continued with reduced depth. Our experiments with verified null-move pruning show that on average, it constructs a smaller search tree with greater tactical strength in comparison to standard null-move pruning. Moreover, unlike standard null-move pruning, which fails badly in zugzwang positions, verified null-move pruning manages to detect most zugzwangs and in such cases conducts a re-search to obtain the correct result. In addition, verified null-move pruning is very easy to implement, and any standard null-move pruning program can use verified null-move pruning by modifying only a few lines of code.
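The idea can be illustrated with a minimal sketch, which is not the authors' code. It uses a toy subtraction game as a stand-in for chess: players alternately take 1 or 2 tokens from a heap, and whoever takes the last token wins. Every heap that is a multiple of 3 is a zugzwang, since the side to move would rather pass. The names `search`, `moves`, `evaluate`, the `can_null` flag (which forbids two null moves in a row) and the reduction `R = 2` are all assumptions made for illustration.

```python
from typing import List

INF = 10**9
R = 2  # depth reduction for the null-move search (assumed value)


def moves(heap: int) -> List[int]:
    """Successor heaps after taking 1 or 2 tokens."""
    return [heap - take for take in (1, 2) if heap - take >= 0]


def evaluate(heap: int) -> int:
    """Score for the side to move: -1 if the opponent just took the last
    token, otherwise 0 (unknown) at the search horizon."""
    return -1 if heap == 0 else 0


def search(heap: int, alpha: int, beta: int, depth: int,
           verify: bool, can_null: bool = True) -> int:
    """Negamax alpha-beta with null-move pruning.

    verify=False gives standard null-move pruning; verify=True gives the
    verified variant sketched from the description above.
    """
    if depth <= 0 or heap == 0:
        return evaluate(heap)

    fail_high = False
    if can_null and (not verify or depth > 1):
        # Null move: pass, then search the same heap with reduced depth.
        value = -search(heap, -beta, -beta + 1, depth - R - 1, verify, False)
        if value >= beta:
            if not verify:
                return value      # standard: cut off immediately
            depth -= 1            # verified: continue with reduced depth
            verify = False        # no further verification in this subtree
            fail_high = True

    while True:
        best, a = -INF, alpha
        for child in moves(heap):
            value = -search(child, -beta, -a, depth - 1, verify)
            best = max(best, value)
            a = max(a, best)
            if a >= beta:
                break
        if fail_high and best < beta:
            # The null move claimed a fail-high that real moves cannot
            # confirm: likely zugzwang, so re-search at the original depth.
            depth += 1
            fail_high = False
            verify = True
            continue
        return best


# Heap 3 is lost for the side to move (true value -1).
standard = search(3, 0, 1, 6, verify=False)
verified = search(3, 0, 1, 6, verify=True)
print(standard, verified)
```

In this toy position the standard variant trusts the null-move fail-high and reports a win, while the verified variant notices that no real move reaches beta and re-searches. A real chess engine would also skip the null move when in check, which this game has no equivalent of.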
With the advent of computers after the end of the Second World War, interest in the development of chess playing programs was stimulated by two seminal papers in this area. The paper by Shannon (1950) remains of central importance even to this day, while the paper by Turing (1953) is equally influential. The minimax algorithm was first applied in a computer chess context in the landmark paper of Shannon. He also introduced the classification of chess playing programs into either type A or type B. Type A programs are those that search by 'brute force' alone, while type B programs try to use considerable selectivity in deciding which branches of the game tree require searching. Alpha-beta pruning was first formulated by McCarthy at the Dartmouth Summer Research Conference on Artificial Intelligence in 1956.
We like our machines to feel human, even if they don't look it. The pulsing on and off of the power light on an Apple computer when it is "sleeping" is reassuring. Even the red light of HAL in 2001: A Space Odyssey gave an assurance that the machine was alive, rather than a faceless menace. One of the pioneers of computing, Alan Turing, was amongst the first to address the challenge of artificial intelligence and gives his name to the Turing test for a "machine's ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human." Learning from our mistakes makes us human.