Tammelin
Cepheus is the first computer program to essentially solve a game of imperfect information that is played competitively by humans. The game it plays is heads-up limit Texas hold'em poker, a game with over 10 14 information sets, and a challenge problem for artificial intelligence for over 10 years. Cepheus was trained using a new variant of Counterfactual Regret Minimization (CFR), called CFR, using 4800 CPUs running for 68 days. In this paper we describe in detail the engineering details required to make this computation a reality. We also prove the theoretical soundness of CFR and its component algorithm, regret-matching . We further give a hint towards understanding the success of CFR by proving a tracking regret bound for this new regret matching algorithm.
Feb-8-2022, 11:40:20 GMT