Collaborating Authors


Human strategic decision making in parametrized games Artificial Intelligence

Strong algorithms have been developed for game classes with many elements of complexity. For example, algorithms were recently able to defeat human professional players in 2-player [16, 3] and 6-player no-limit Texas hold'em [4]. These games have imperfect information, sequential actions, and very large state spaces, and the latter involves more than two players (solving multiplayer games is more challenging than solving two-player zero-sum games from a complexity-theoretic perspective). However, these algorithms all require an extremely large amount of computational resources for offline and/or online computation and for optimizing neural network hyperparameters. The algorithms have a further limitation: they use all these resources to solve just one very specific version of the game (e.g., Libratus and DeepStack assumed that all players start the hand with 200 times the big blind, and Pluribus assumed that all players start the hand with 100 times the big blind).

ELO System for Skat and Other Games of Chance Artificial Intelligence

Assessing the skill level of players, in order to predict outcomes and to rank the players over a longer series of games, is of critical importance for tournament play. Despite weaknesses, such as an observed continuous rating inflation caused by a steadily growing player body, the ELO ranking system, named after its creator Arpad Elo, has proven to be a reliable method for calculating the relative skill levels of players in zero-sum games. The evaluation of player strength in trick-taking card games like Skat or Bridge, however, is not obvious. Firstly, these are partially observable games of incomplete information with more than one player, where opponent strength should influence the scoring as it does in existing ELO systems. Secondly, they are games of both skill and chance, so that, besides playing strength, the outcome of a game also depends on the deal. Last but not least, there are internationally established scoring systems to which players are accustomed and with which ELO should align. Based on a tournament scoring system, we propose a new ELO system for Skat to overcome these weaknesses.
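The standard Elo update that the abstract builds on can be sketched in a few lines. This is the classic two-player formula, not the Skat-specific system the paper proposes; the K-factor of 20 is an illustrative assumption.

```python
# Classic Elo rating update: expected score from the logistic model,
# then a linear adjustment proportional to (actual - expected).
def expected_score(rating_a, rating_b):
    """Probability that player A beats player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

def elo_update(rating_a, rating_b, score_a, k=20):
    """Return updated ratings; score_a is 1 (win), 0.5 (draw), 0 (loss)."""
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1 - score_a) - (1 - exp_a))
    return new_a, new_b

# Equal-rated players: a win moves the winner up by k/2.
print(elo_update(1500, 1500, 1.0))  # → (1510.0, 1490.0)
```

The difficulty the abstract points to is visible here: the formula assumes a two-player zero-sum result, with no term for the luck of the deal or for multiplayer tournament scoring.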

Facebook Open Sources ReBeL, a New Reinforcement Learning Agent - KDnuggets


I recently started a new newsletter focused on AI education. TheSequence is a no-BS (meaning no hype, no news, etc.) AI-focused newsletter that takes 5 minutes to read. The goal is to keep you up to date with machine learning projects, research papers, and concepts. Poker has been considered by many the core inspiration for the formalization of game theory. John von Neumann was reportedly an avid poker fan and used many analogies from the card game while creating the foundations of game theory.

AI Algorithm From Facebook Can Play Chess & Poker With Equal Ease


In recent news, the research team at Facebook has introduced a general AI bot, ReBeL, that can play both perfect-information games, such as chess, and imperfect-information games, like poker, with equal ease, using reinforcement learning. As the company says, it is a big step towards creating a general AI algorithm that could perform well over a range of games. The researchers believe that this algorithm will have real-world applications, including negotiations, fraud detection, and even cybersecurity. AlphaZero from DeepMind rapidly caught the fancy of the AI research community when it was released back in 2017. An AI-based program that could play games like chess, shogi, and Go is not unheard of, but AlphaZero is different: it uses reinforcement learning with search (RL+Search) to 'learn on its own' and reach the level of world-class players.

Faster Algorithms for Optimal Ex-Ante Coordinated Collusive Strategies in Extensive-Form Zero-Sum Games Artificial Intelligence

We focus on the problem of finding an optimal strategy for a team of two players that faces an opponent in an imperfect-information zero-sum extensive-form game. Team members are not allowed to communicate during play but can coordinate before the game. In that setting, it is known that the best the team can do is sample a profile of potentially randomized strategies (one per player) from a joint (a.k.a. correlated) probability distribution at the beginning of the game. In this paper, we first provide new modeling results about computing such an optimal distribution by drawing a connection to a different literature on extensive-form correlation. Second, we provide an algorithm that computes such an optimal distribution using only profiles in which a single team member randomizes. We can also cap the number of such profiles allowed in the solution; increasing the cap yields an anytime algorithm. We find that often a handful of well-chosen profiles suffices to reach optimal utility for the team. This enables team members to reach coordination through a relatively simple and understandable plan. Finally, inspired by this observation and leveraging theoretical concepts that we introduce, we develop an efficient column-generation algorithm for finding an optimal distribution for the team. We evaluate it on a suite of common benchmark games. It is three orders of magnitude faster than the prior state of the art on games that the latter can solve, and it can also solve several games that were previously unsolvable.
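The ex-ante coordination mechanism described above can be sketched simply: before play begins, the team draws one joint strategy profile from a correlated distribution, and each member then follows their own component without further communication. The profiles and probabilities below are made up purely for illustration; computing the optimal distribution is the hard part the paper addresses.

```python
import random

# Hypothetical correlated distribution over joint strategy profiles:
# each entry pairs a (per-player) strategy assignment with a probability.
profiles = [
    ({"p1": "aggressive", "p2": "passive"}, 0.6),
    ({"p1": "passive", "p2": "aggressive"}, 0.4),
]

def sample_profile(profiles, rng=random):
    """Draw one joint profile according to its probability (inverse CDF)."""
    r = rng.random()
    acc = 0.0
    for profile, prob in profiles:
        acc += prob
        if r < acc:
            return profile
    return profiles[-1][0]  # guard against floating-point round-off

# The single shared draw is what correlates the two players' behavior.
chosen = sample_profile(profiles)
print("p1 plays:", chosen["p1"], "| p2 plays:", chosen["p2"])
```

Note that each profile here lets only one stylized "randomizing" role vary per draw, mirroring the paper's observation that profiles in which a single member randomizes can suffice.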

Interpreting the Scope of AI job Market in the US in current times.


We are living in a time where everything is digital. Disruptive technologies like artificial intelligence (AI) have become central to this transformation. From retail to fintech and cybersecurity to predictive analytics, tech pundits avow that AI now plays an essential role in the future of these industries and disciplines. However, though some alarmists argue that AI is stealing jobs through automation and robotics, it has been observed that, on the contrary, AI is also adding new job roles to the existing employment pool every day. Researchers have tracked new job roles, occupations, and emerging industries in the AI landscape that can help us understand the job market better.

RLCFR: Minimize Counterfactual Regret by Deep Reinforcement Learning Machine Learning

Counterfactual regret minimization (CFR) is a popular method for decision-making problems in two-player zero-sum games with imperfect information. Unlike existing studies, which mostly aim at solving larger-scale problems or accelerating solution efficiency, we propose a framework, RLCFR, which aims at improving the generalization ability of the CFR method. In RLCFR, the game strategy is solved by CFR within a reinforcement learning framework, and the dynamic procedure of iterative strategy updating is modeled as a Markov decision process (MDP). Our method then learns a policy that selects the appropriate regret-updating rule at each step of the iteration. In addition, a stepwise reward function, proportional to how good the iteration strategy is at each step, is formulated to learn the action policy. Extensive experimental results on various games show that the generalization ability of our method is significantly improved compared with existing state-of-the-art methods.
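The per-information-set update at the heart of CFR is regret matching: cumulative regrets are clipped at zero and normalized into a strategy. The sketch below shows just that core step, not the RLCFR framework itself, which learns which update rule to apply.

```python
# Regret matching: play each action in proportion to its positive
# cumulative regret; fall back to uniform when no regret is positive.
def regret_matching(cumulative_regrets):
    """Map a list of cumulative regrets to a probability distribution."""
    positives = [max(r, 0.0) for r in cumulative_regrets]
    total = sum(positives)
    n = len(cumulative_regrets)
    if total <= 0:
        return [1.0 / n] * n  # uniform fallback
    return [p / total for p in positives]

# Regret of 3 for action 0, negative for action 1, 1 for action 2.
print(regret_matching([3.0, -1.0, 1.0]))  # → [0.75, 0.0, 0.25]
```

In full CFR this update runs at every information set on every iteration, and the time-averaged strategy converges to a Nash equilibrium in two-player zero-sum games; RLCFR treats the choice among such update variants as the action space of an MDP.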

The Changing Face of AI Research


However, it is by no means clear yet whether this will prove a game-changer in the world ahead. Computer programmers have long worked to find the right, relevant patterns in data so that their systems become extremely good at multiplayer games. A whitepaper published by researchers at Facebook and Carnegie Mellon University said their software is good at embracing randomness and can reliably beat humans at games. Artificial intelligence is heralded as a solution to the complex problems faced by many industries and organizations. The prime concern for businesses today is how to gain better insights by harnessing big data.

These online courses teach you how to win at online poker


TL;DR: The Ultimate Poker Pro Blueprint Mastery Bundle is on sale for £16.08 as of August 14, saving you 99% on the list price. Playing poker online is a totally different game than playing in real life. You aren't playing other people so much as you are playing the algorithm. Therefore, it requires a touch less skill and a touch more pattern recognition and smarts. In the Ultimate Poker Pro Blueprint Mastery Bundle, you'll learn exactly what it takes to win money playing poker online.

The Deck Is Not Rigged: Poker and the Limits of AI


Tuomas Sandholm, a computer scientist at Carnegie Mellon University, is not a poker player--or much of a poker fan, in fact--but he is fascinated by the game for much the same reason as the great game theorist John von Neumann before him. Von Neumann, who died in 1957, viewed poker as the perfect model for human decision making, for finding the balance between skill and chance that accompanies our every choice. He saw poker as the ultimate strategic challenge, combining as it does not just the mathematical elements of a game like chess but the uniquely human, psychological angles that are more difficult to model precisely--a view shared years later by Sandholm in his research with artificial intelligence. "Poker is the main benchmark and challenge program for games of imperfect information," Sandholm told me on a warm spring afternoon in 2018, when we met in his offices in Pittsburgh. The game, it turns out, has become the gold standard for developing artificial intelligence.