Smooth UCT Search in Computer Poker
Heinrich, Johannes (University College London) | Silver, David (Google DeepMind)
Self-play Monte Carlo Tree Search (MCTS) has been successful in many perfect-information two-player games. Although these methods have been extended to imperfect-information games, so far they have not achieved the same level of practical success or theoretical convergence guarantees as competing methods. In this paper we introduce Smooth UCT, a variant of the established Upper Confidence Bounds Applied to Trees (UCT) algorithm.

They concluded that UCT quickly finds a good but suboptimal policy, while Outcome Sampling initially learns more slowly but converges to the optimal policy over time. In this paper, we address the question whether the inability of UCT to converge to a Nash equilibrium can be overcome while retaining UCT's fast initial learning rate. We focus on the full-game MCTS setting, which is an important step towards developing sound variants of online MCTS in imperfect-information games. In particular, we introduce Smooth UCT, which combines
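For readers unfamiliar with the UCT algorithm that Smooth UCT builds on, the core of UCT is its action-selection rule: at each tree node it picks the action maximizing a UCB1 score that trades off the estimated action value against an exploration bonus. Below is a minimal sketch of that selection step; the dict-based node structure (`n` for visit counts, `q` for mean value, `children` for the action map) is a hypothetical layout chosen for illustration, not the paper's implementation.

```python
import math

def uct_select(node, c=math.sqrt(2)):
    """Select the action maximizing the UCB1 score used by UCT:
    Q(s, a) + c * sqrt(ln N(s) / N(s, a)).

    `node` is assumed to be a dict with a visit count `n` and a
    `children` dict mapping each action to a dict holding that
    action's mean value `q` and visit count `n`.
    """
    def score(child):
        if child["n"] == 0:
            return float("inf")  # try every unvisited action first
        exploration = c * math.sqrt(math.log(node["n"]) / child["n"])
        return child["q"] + exploration

    return max(node["children"], key=lambda a: score(node["children"][a]))

# Example: with equal value estimates, the less-visited action wins
# on its larger exploration bonus.
node = {
    "n": 10,
    "children": {
        "fold": {"q": 0.1, "n": 5},
        "call": {"q": 0.1, "n": 2},
    },
}
```

Here `uct_select(node)` returns `"call"`, since its smaller visit count gives it the larger exploration bonus. Smooth UCT modifies this step by sometimes sampling from the average policy instead of always maximizing, which is what enables convergence towards a Nash equilibrium.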
Jul-15-2015