Since the state space of most games is a directed graph, many game-playing systems detect repeated positions with a transposition table. This approach can substantially reduce search effort. However, it suffers from the so-called Graph History Interaction (GHI) problem, which causes errors in games containing repeated positions. This paper presents a practical solution to the GHI problem that combines and extends previous techniques. Because our scheme is general, it is applicable to different game tree search algorithms and to different domains. As demonstrated with two algorithms, αβ and df-pn, in two games, checkers and Go, our scheme incurs only a very small overhead while guaranteeing the correctness of solutions.
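To make the mechanism concrete, here is a minimal sketch of exact-match transposition-table reuse in a negamax search; the helpers hash_key, evaluate, and successors are assumptions, not from the paper. Note that the cached value is reused regardless of the path that reached the position, which is precisely the assumption the GHI problem breaks when values depend on history, as under repetition rules.

```python
# Minimal sketch of a transposition table in a depth-limited negamax search.
# hash_key, evaluate, and successors are assumed helpers; real engines
# typically compute the key with Zobrist hashing.

table = {}  # position key -> (search depth, value)

def search(pos, depth):
    key = hash_key(pos)
    hit = table.get(key)
    if hit is not None and hit[0] >= depth:
        return hit[1]  # transposition: reuse the earlier result, path ignored
    if depth == 0:
        value = evaluate(pos)
    else:
        value = max(-search(s, depth - 1) for s in successors(pos))
    table[key] = (depth, value)
    return value
```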
In this paper we introduce a new algorithm for updating the parameters of a heuristic evaluation function, by updating the heuristic towards the values computed by an alpha-beta search. Our algorithm differs from previous approaches to learning from search, such as Samuel's checkers player and the TD-Leaf algorithm, in two key ways. First, we update all nodes in the search tree, rather than a single node. Second, we use the outcome of a deep search, instead of the outcome of a subsequent search, as the training signal for the evaluation function. We implemented our algorithm in the chess program Meep, using a linear heuristic function. After initialising its weight vector to small random values, Meep was able to learn high-quality weights from self-play alone. When tested online against human opponents, Meep played at master level, the best performance of any chess program with a heuristic learned entirely from self-play.
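As a rough illustration of the described update, the following is a hypothetical sketch for a linear heuristic V(s) = θ·φ(s): every node in the alpha-beta search tree is regressed toward the minimax value backed up by the deep search, rather than bootstrapping a single node from a later search as TD-Leaf does. The function and parameter names here are assumptions, not the paper's.

```python
import numpy as np

def search_bootstrap_update(theta, tree_nodes, alpha=1e-3):
    """Move a linear heuristic toward deep-search values.

    theta:      weight vector of the linear evaluation function
    tree_nodes: iterable of (phi, v_search) pairs, one per node in the
                search tree, where phi is the feature vector of the node
                and v_search is its backed-up minimax value
    """
    for phi, v_search in tree_nodes:
        error = v_search - np.dot(theta, phi)  # search value minus heuristic
        theta += alpha * error * phi           # gradient step for linear V
    return theta
```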
Transposition tables are a powerful tool in search domains for avoiding duplicate effort and for guiding node expansions. Traditionally, however, they have only been applicable when the current state is exactly the same as a previously explored state. We consider a generalized transposition table, whereby a similarity metric that exploits local structure is used to compare the current state with a neighbourhood of previously seen states. We illustrate this concept, together with forward pruning based on function approximation, in the domain of Skat, and show that we can achieve speedups of a factor of 16 over standard methods.
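A minimal sketch of the generalized lookup, with all names (similarity, threshold) assumed rather than taken from the paper: instead of requiring an exact hash match, the table is probed for any stored neighbour that the similarity metric judges close enough. A real implementation would index states by their local structure rather than scan linearly.

```python
def generalized_lookup(table, state, similarity, threshold):
    """Return a cached value from any sufficiently similar stored state.

    table:      dict mapping previously seen states to cached values
    similarity: metric comparing two states via their local structure
    threshold:  minimum similarity at which a cached value is reused
    """
    for stored_state, value in table.items():
        if similarity(state, stored_state) >= threshold:
            return value  # near-transposition: reuse the neighbour's result
    return None           # no similar state seen; search normally
```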
Search is a topic of fundamental importance to artificial intelligence (AI). The range of search strategies investigated stretches from application-independent methods to application-dependent, knowledge-intensive methods. The former hold the promise of general applicability, the latter of high performance. An important experimental domain for search algorithms has been the field of game playing. Arguably, this research has been among the most successful in AI, leading to impressive results in chess (Deep Blue, formerly Deep Thought, playing at Grandmaster strength (Hsu et al. 1990)), checkers (Chinook, the World Man-Machine Champion (Schaeffer et al. 1996)), Othello (Logistello, significantly stronger than all humans (Buro 1994)), and backgammon (TD-Gammon, playing at World Championship level strength (Tesauro 1995)).