AlphaGo Zero demystified


DeepMind shook the worlds of Reinforcement Learning and Go with AlphaGo, and later with AlphaGo Zero. AlphaGo was the first computer program to beat a human professional Go player without handicap on a 19 x 19 board. Later versions went on to beat the world champion Lee Sedol 4 games to 1, and then Ke Jie (the number one ranked player in the world at the time) along with many other top-ranked players. AlphaGo Zero, which learns purely from self-play without human game records, then surpassed all of those earlier versions. Go is a difficult environment because of its very large branching factor (roughly 250 legal moves per position, against about 35 in chess), which makes classical techniques such as alpha-beta pruning and heuristic search impractical. In this article I present my work on reproducing the AlphaGo Zero paper as closely as I could. It again requires background knowledge in Machine Learning and Python, as I will refer to my own implementation.