SAI, a Sensible Artificial Intelligence that plays Go

Morandin, Francesco, Amato, Gianluca, Gini, Rosa, Metta, Carlo, Parton, Maurizio, Pascutto, Gian-Carlo

arXiv.org Artificial Intelligence 

The longstanding challenge in artificial intelligence of playing Go at professional human level has been succesfully tackled in recent works [5, 7, 6], where software tools (AlphaGo, AlphaGo Zero, AlphaZero) combining neural networks and Monte Carlo tree search reached superhuman level. A recent development was Leela Zero [4], an open source software whose neural network is trained over millions of games played in a distributed fashion, thus allowing improvements within reach of the resources of the academic community. However, all these programs suffer from a relevant limitation: it is impossible to target their victory margin. They are trained with a fixed komi of 7.5 and they are built to maximize just the winning probability, not considering the score difference. This has several negative consequences for these programs: when they are ahead, they choose suboptimal moves, and often win by a small margin; they cannot be used with komi 6.5, which is also common in professional games; they show bad play in handicap games, since the winning probability is not a relevant attribute in that situations. In principle all these problems could be overcome by replacing the binary reward (win 1, lose 0) with the game score difference, but the latter is known to be less robust [3, 8] and in general strongest programs use the former since the seminal works [1, 3, 2].

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found