Combining Deep Reinforcement Learning and Search for Imperfect-Information Games

Brown, Noam, Bakhtin, Anton, Lerer, Adam, Gong, Qucheng

Jul-27-2020–arXiv.org Artificial Intelligence

The combination of deep reinforcement learning and search at both training and test time is a powerful paradigm that has led to a number of successes in single-agent settings and perfect-information games, best exemplified by the success of AlphaZero. However, algorithms of this form have been unable to cope with imperfect-information games. This paper presents ReBeL, a general framework for self-play reinforcement learning and search for imperfect-information games. In the simpler setting of perfect-information games, ReBeL reduces to an algorithm similar to AlphaZero. Results show ReBeL leads to low exploitability in benchmark imperfect-information games and achieves superhuman performance in heads-up no-limit Texas hold'em poker, while using far less domain knowledge than any prior poker AI. We also prove that ReBeL converges to a Nash equilibrium in two-player zero-sum games in tabular settings.

machine learning, reinforcement learning, subgame

Country:
- North America > United States
  - Texas (0.25)
  - Rhode Island (0.04)
  - Massachusetts > Middlesex County
    - Cambridge (0.04)
  - California > San Mateo County
    - Menlo Park (0.04)

Genre:
- Research Report > New Finding (0.34)

Industry:
- Leisure & Entertainment > Games > Poker (0.88)

Technology:
- Information Technology
  - Game Theory (1.00)
  - Artificial Intelligence > Machine Learning
    - Reinforcement Learning (1.00)
    - Learning Graphical Models > Undirected Networks
      - Markov Models (0.46)
