Bootstrapping from Game Tree Search
Joel Veness, David Silver, Alan Blair, William Uther
In this paper we introduce a new algorithm for updating the parameters of a heuristic evaluation function towards the values computed by an alpha-beta search. Our algorithm differs from previous approaches to learning from search, such as Samuel's checkers player and the TD-Leaf algorithm, in two key ways. First, we update all nodes in the search tree, rather than a single node. Second, we use the outcome of a deep search, instead of the outcome of a subsequent search, as the training signal for the evaluation function. We implemented our algorithm in the chess program Meep, using a linear heuristic function. After initialising its weight vector to small random values, Meep was able to learn high-quality weights from self-play alone. When tested online against human opponents, Meep played at a master level, the best performance of any chess program with a heuristic learned entirely from self-play.
Neural Information Processing Systems
Dec-31-2009
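
The abstract's central idea is that every node visited by the alpha-beta search is nudged towards the deep search value backed up to it, rather than a single root or leaf node being updated towards a later search. The sketch below is a minimal illustration of that update under the assumption of a linear heuristic H(s) = theta . phi(s); the names `treestrap_update`, `tree_nodes`, and `eta` are illustrative and not taken from the paper.

```python
import numpy as np

def treestrap_update(theta, tree_nodes, eta=1e-3):
    """One pass of a TreeStrap-style update (a sketch, not the paper's exact code).

    theta      : weight vector of the linear heuristic H(s) = theta . phi(s)
    tree_nodes : iterable of (phi, v_deep) pairs, one per node visited by the
                 alpha-beta search, where v_deep is the deep minimax value
                 backed up to that node by the same search
    eta        : step size
    """
    for phi, v_deep in tree_nodes:
        h = np.dot(theta, phi)              # current heuristic estimate at this node
        delta = v_deep - h                  # error: deep search value minus heuristic
        theta = theta + eta * delta * phi   # gradient step on the squared error
    return theta
```

Because all nodes in the tree contribute an error term, a single deep search yields many training targets, which is what distinguishes this update from updating only the root towards a subsequent search.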
- Country:
  - North America
    - Canada > Alberta (0.28)
    - United States > California > San Francisco County > San Francisco (0.14)
  - Oceania > Australia (0.29)
- Genre:
  - Research Report > New Finding (0.69)
- Industry:
  - Leisure & Entertainment > Games > Chess (1.00)
- Technology:
  - Information Technology > Artificial Intelligence
    - Cognitive Science > Problem Solving (1.00)
    - Games (1.00)
    - Machine Learning > Performance Analysis > Accuracy (0.42)
    - Representation & Reasoning > Search (1.00)