The current most popular variant of poker, played in casinos and seen on television, is no-limit Texas hold'em. This game and a smaller variant, limit Texas hold'em, have been used as a testbed for artificial intelligence research since 1997. Since 2006, the Annual Computer Poker Competition has allowed researchers, programmers, and poker players to play their poker programs against each other, allowing us to find out which artificial intelligence techniques work best in practice. The competition has resulted in significant advances in fields such as computational game theory, and resulted in algorithms that can find optimal strategies for games six orders of magnitude larger than was possible using earlier techniques.
Even now, all you have to do is Google search 'AI 2017' to find headlines like these: '2017 laid the foundation for faster, smarter AI in 2018' and 'All the creepy, crazy and amazing things that happened in AI in 2017'. AI took the tech industry by storm. Swarm AI correctly predicted TIME's Person of the Year to be Donald Trump, AI moved into the household through the Amazon Echo and Google Home, and Google's DeepMind AlphaGo Zero conquered the 2,000-year-old board game 'Go' through machine learning. If you didn't already know: AlphaGo literally recreated itself without the help of humans, using reinforcement learning to surpass the abilities of world champion Lee Sedol and become the best Go player in the world. In 2018, poker bot Libratus was the first to beat 15 top human players, and American technology company Nvidia created AI that could mimic your facial features, handwriting, and voice. They created 'celebrities' that don't even exist. Though it didn't impress everyone, with comments like: The iceberg that would later reveal the all-conquering and all-powerful force reckoned to control our entire lives – otherwise known as artificial intelligence.
Computing a good strategy in a large extensive form game often demands an extraordinary amount of computer memory, necessitating the use of abstraction to reduce the game size. Typically, strategies from abstract games perform better in the real game as the granularity of abstraction is increased. This paper investigates two techniques for stitching a base strategy in a coarse abstraction of the full game tree, to expert strategies in fine abstractions of smaller subtrees. We provide a general framework for creating static experts, an approach that generalizes some previous strategy stitching efforts. In addition, we show that static experts can create strong agents for both 2-player and 3-player Leduc and Limit Texas Hold'em poker, and that a specific class of static experts can be preferred among a number of alternatives.
In two-player zero-sum games a Nash equilibrium strategy is guaranteed to win (or tie) in expectation against any opposing strategy by the minimax theorem. In games with more than two players there can be multiple equilibria with different values to the players, and following one has no performance guarantee; however, it was shown that a Nash equilibrium strategy defeated a variety of agents submitted for a class project in a 3-player imperfect-information game, Kuhn poker. This demonstrates that Nash equilibrium strategies can be successful in practice despite the fact that they do not have a performance guarantee. While Nash equilibrium can be computed in polynomial time for two-player zero-sum games, it is PPAD-hard to compute for nonzero-sum games and games with 3 or more agents, and it is widely believed that no efficient algorithms exist [8, 9]. Counterfactual regret minimization (CFR) is an iterative self-play procedure that has been proven to converge to Nash equilibrium in two-player zero-sum games.
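To make the self-play idea concrete, here is a minimal sketch of vanilla CFR on the standard two-player version of Kuhn poker (three cards, one bet), not the 3-player game mentioned above. All names (`Node`, `cfr`, `train`) are hypothetical and chosen for illustration; the sketch enumerates all six deals each iteration rather than sampling them, and it is one possible arrangement of the algorithm, not the implementation used in the excerpted paper.

```python
from itertools import permutations

ACTIONS = "pb"  # p = pass (check or fold), b = bet (bet or call)


class Node:
    """Regret and strategy accumulators for one information set."""

    def __init__(self):
        self.regret_sum = [0.0, 0.0]
        self.strategy_sum = [0.0, 0.0]
        self.strategy = [0.5, 0.5]  # current strategy, starts uniform

    def update_strategy(self):
        # Regret matching: play in proportion to positive cumulative regret.
        positive = [max(r, 0.0) for r in self.regret_sum]
        total = sum(positive)
        self.strategy = [p / total for p in positive] if total > 0 else [0.5, 0.5]

    def average_strategy(self):
        total = sum(self.strategy_sum)
        return [s / total for s in self.strategy_sum] if total > 0 else [0.5, 0.5]


def terminal_utility(cards, history):
    """Payoff to the player whose turn it would be, or None if play continues."""
    if len(history) < 2:
        return None
    player = len(history) % 2
    higher = cards[player] > cards[1 - player]
    last_two = history[-2:]
    if last_two == "bp":  # opponent folded to this player's bet
        return 1
    if last_two == "pp":  # both checked: showdown for the antes
        return 1 if higher else -1
    if last_two == "bb":  # bet and call: showdown for ante plus bet
        return 2 if higher else -2
    return None  # "pb": the first player still has to respond


def cfr(cards, history, reach0, reach1, nodes):
    """Return the expected utility for the player to act at `history`."""
    payoff = terminal_utility(cards, history)
    if payoff is not None:
        return payoff
    player = len(history) % 2
    node = nodes.setdefault(str(cards[player]) + history, Node())
    own_reach, opp_reach = (reach0, reach1) if player == 0 else (reach1, reach0)
    action_utils = [0.0, 0.0]
    node_util = 0.0
    for i, action in enumerate(ACTIONS):
        prob = node.strategy[i]
        if player == 0:
            child = cfr(cards, history + action, reach0 * prob, reach1, nodes)
        else:
            child = cfr(cards, history + action, reach0, reach1 * prob, nodes)
        action_utils[i] = -child  # zero-sum: the child value belongs to the opponent
        node_util += prob * action_utils[i]
    for i in range(2):
        # Counterfactual regret is weighted by the opponent's reach probability.
        node.regret_sum[i] += opp_reach * (action_utils[i] - node_util)
        node.strategy_sum[i] += own_reach * node.strategy[i]
    return node_util


def train(iterations):
    """Run CFR; return (average value for the first player, average strategies)."""
    nodes = {}
    deals = list(permutations((1, 2, 3), 2))  # (first player's card, second's)
    total_value = 0.0
    for _ in range(iterations):
        for cards in deals:
            total_value += cfr(cards, "", 1.0, 1.0, nodes) / len(deals)
        for node in nodes.values():
            node.update_strategy()
    average = {key: node.average_strategy() for key, node in nodes.items()}
    return total_value / iterations, average


if __name__ == "__main__":
    value, strategies = train(10000)
    print(f"average value for the first player: {value:.4f}")
    for info_set in sorted(strategies):
        print(info_set, [round(p, 3) for p in strategies[info_set]])
```

Each information set is keyed by the acting player's card followed by the betting history, so `"3b"` means "holding the King, facing a bet". It is the average strategy, not the final one, that approaches equilibrium; the known value of this game for the first player is -1/18.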
In this tutorial, you will learn step-by-step how to implement a poker bot in Python. First, we need an engine in which we can simulate our poker bot. It also has a GUI available which can graphically display a game. Both the engine and the GUI have excellent tutorials on their GitHub pages on how to use them. The choice of engine (and/or GUI) is arbitrary, and it can be replaced by any engine (and/or GUI) you like.
Poker is considered a good challenge for AI, as it is seen as a combination of mathematical/strategic play and human intuition, especially about the strategies of others. I would consider the game a cross between the two extremes of technical vs. human skill: chess and rock-paper-scissors. In the game of chess, the technically superior player will almost always win; an amateur would lose literally 100% of their games to the top chess-playing AI. In rock-paper-scissors, if the top AI plays the perfect strategy, choosing each option 1/3rd of the time, it will be unbeatable, but also by definition incapable of beating anyone. To see why, let's analyse how it plays against the Bart Simpson strategy: if your opponent always plays rock, you will play rock 1/3rd of the time, paper 1/3rd and scissors 1/3rd, meaning you will tie 1/3rd, win 1/3rd, and lose 1/3rd.
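The arithmetic in the rock-paper-scissors example can be checked with a few lines of Python. This is a small sketch using exact fractions; the names `outcome_probabilities`, `expected_payoff`, `UNIFORM` and `ALWAYS_ROCK` are made up for illustration, and a win is assumed to be worth +1, a loss -1 and a tie 0.

```python
from fractions import Fraction

ACTIONS = ("rock", "paper", "scissors")
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def outcome_probabilities(mine, theirs):
    """Return (win, tie, loss) probabilities for two mixed strategies."""
    win = tie = loss = Fraction(0)
    for my_action in ACTIONS:
        for their_action in ACTIONS:
            weight = mine[my_action] * theirs[their_action]
            if my_action == their_action:
                tie += weight
            elif BEATS[my_action] == their_action:
                win += weight
            else:
                loss += weight
    return win, tie, loss


def expected_payoff(mine, theirs):
    """Expected payoff with +1 for a win, -1 for a loss, 0 for a tie."""
    win, _, loss = outcome_probabilities(mine, theirs)
    return win - loss


UNIFORM = {action: Fraction(1, 3) for action in ACTIONS}
ALWAYS_ROCK = {"rock": Fraction(1), "paper": Fraction(0), "scissors": Fraction(0)}

print(outcome_probabilities(UNIFORM, ALWAYS_ROCK))  # win, tie, loss are each 1/3
print(expected_payoff(UNIFORM, ALWAYS_ROCK))        # 0
```

The uniform strategy earns an expected payoff of 0 against always-rock, and the same holds against any other opponent strategy. A strategy that always plays paper would instead win every round against always-rock, but could itself be exploited by an opponent who switches to scissors.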
The AI, called Pluribus, defeated poker professional Darren Elias, who holds the record for most World Poker Tour titles, and Chris "Jesus" Ferguson, winner of six World Series of Poker events. Each pro separately played 5,000 hands of poker against five copies of Pluribus. In another experiment involving 13 pros, all of whom have won more than $1 million playing poker, Pluribus played five pros at a time for a total of 10,000 hands and again emerged victorious. "Pluribus achieved superhuman performance at multi-player poker, which is a recognized milestone in artificial intelligence and in game theory that has been open for decades," said Tuomas Sandholm, Angel Jordan Professor of Computer Science, who developed Pluribus with Noam Brown, who is finishing his Ph.D. in Carnegie Mellon's Computer Science Department as a research scientist at Facebook AI. "Thus far, superhuman AI milestones in strategic reasoning have been limited to two-party competition. The ability to beat five other players in such a complicated game opens up new opportunities to use AI to solve a wide variety of real-world problems."
You are right that the algorithms in Pluribus are totally different than reinforcement learning or MCTS. At a high level, that is because our settings are 1) games, that is, there is more than one player, and 2) of imperfect information, that is, when a player has to choose an action, the player does not know the entire state of the world. There is no good textbook on solving imperfect-information games. So, to read up on this literature, you will need to read research papers. Below in this post are selected papers from my research group that would be good to read given that you want to learn about this field.
AI has definitively beaten humans at another of our favorite games. A poker bot, designed by researchers from Facebook's AI lab and Carnegie Mellon University, has bested some of the world's top players in a series of games of six-person no-limit Texas Hold'em poker. Over 12 days and 10,000 hands, the AI system named Pluribus faced off against 12 pros in two different settings. In one, the AI played alongside five human players; in the other, five versions of the AI played with one human player (the computer programs were unable to collaborate in this scenario). Pluribus won an average of $5 per hand with hourly winnings of around $1,000 -- a "decisive margin of victory," according to the researchers.
Poker requires a skill that has always seemed uniquely human: the ability to be devious. To win, players must analyze how their opponents are playing and then trick them into handing over their chips. Such cunning, of course, comes pretty naturally to people. Now an AI program has, for the first time, shown itself capable of outwitting a whole table of poker pros using similar skills.
As Mr. Elias realized, Pluribus knew when to bluff, when to call someone else's bluff and when to vary its behavior so that other players couldn't pinpoint its strategy. "It does all the things the best players in the world do," said Mr. Elias, 32, who has won a record four titles on the World Poker Tour. "And it does a few things humans have a hard time doing." Experts believe the techniques that drive this and similar systems could be used in Wall Street trading, auctions, political negotiations and cybersecurity, activities that, like poker, involve hidden information. "You don't always know the state of the real world," said Noam Brown, the Facebook researcher who oversaw the Pluribus project.