We derive a recursive formula for expected utility values in imperfect-information game trees, and an imperfect-information game tree search algorithm based on it. The formula and algorithm are general enough to incorporate a wide variety of opponent models. We analyze two opponent models. The "paranoid" model is an information-set analog of the minimax rule used in perfect-information games. The "overconfident" model assumes the opponent moves randomly. Our experimental tests in the game of kriegspiel chess (an imperfect-information variant of chess) produced surprising results: (1) against each other, and against one of the kriegspiel algorithms presented at IJCAI-05, the overconfident model usually outperformed the paranoid model; (2) the performance of both models depended greatly on how well the model corresponded to the opponent's behavior. These results suggest that the usual assumption of perfect-information game tree search--that the opponent will choose the best possible move--isn't as useful in imperfect-information games.
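The two opponent models can be illustrated with a minimal sketch (toy payoffs and hypothetical function names, not the authors' exact formulation). An information set is the set of game states the player cannot distinguish; for a candidate move, both models average utility over those states, but the paranoid model assumes the opponent picks the reply that is worst for us in each state, while the overconfident model averages over uniformly random replies:

```python
def paranoid_value(info_set, move, replies, payoff):
    """Average over states; opponent picks the worst reply for us."""
    return sum(min(payoff[(s, move, r)] for r in replies[s])
               for s in info_set) / len(info_set)

def overconfident_value(info_set, move, replies, payoff):
    """Average over states; opponent replies uniformly at random."""
    return sum(sum(payoff[(s, move, r)] for r in replies[s]) / len(replies[s])
               for s in info_set) / len(info_set)

# Toy example: two states the player cannot tell apart.
replies = {"s1": ["a", "b"], "s2": ["a", "b"]}
payoff = {("s1", "m", "a"): 1.0, ("s1", "m", "b"): 0.0,
          ("s2", "m", "a"): 0.0, ("s2", "m", "b"): 1.0}
info_set = ["s1", "s2"]

print(paranoid_value(info_set, "m", replies, payoff))       # 0.0
print(overconfident_value(info_set, "m", replies, payoff))  # 0.5
```

The toy example shows why the models can disagree sharply: the paranoid model values the move at 0 because in every possible state some reply refutes it, while the overconfident model values it at 0.5 because no single opponent reply is bad everywhere.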
Making decisions under uncertainty remains one of the central problems in AI research. An agent in an uncertain world needs to select actions from the action search space -- the set of all possible actions in that world. As the uncertainty increases, this task can become increasingly difficult. The number of possible actions may increase, the number of possible situations in which those actions may be applied may increase, or both. The effects of these growing search spaces are amplified as the agent tries to search further ahead.
Monte Carlo tree search has brought significant improvements to the level of computer players in games such as Go, but so far it has not been used very extensively in games of strongly imperfect information with a dynamic board and an emphasis on risk management and decision making under uncertainty. In this paper we explore its application to the game of Kriegspiel (invisible chess), providing three Monte Carlo methods of increasing strength for playing the game with little specific knowledge. We compare these Monte Carlo agents to the strongest known minimax-based Kriegspiel player, obtaining significantly better results with a considerably simpler logic and less domain-specific knowledge.
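The core Monte Carlo idea behind such agents can be sketched as follows (a simplification with hypothetical names, not the paper's actual methods): sample board states consistent with the player's observations, score each legal move against the samples, and play the best-scoring move.

```python
import random

def monte_carlo_move(consistent_states, legal_moves, evaluate, n_samples=100):
    """Score each move against random samples drawn from the set of
    states consistent with observations; return the best-scoring move."""
    best_move, best_score = None, float("-inf")
    for move in legal_moves:
        samples = random.choices(consistent_states, k=n_samples)
        score = sum(evaluate(state, move) for state in samples) / n_samples
        if score > best_score:
            best_move, best_score = move, score
    return best_move
```

In a game like Kriegspiel the hard part, glossed over here, is generating `consistent_states`: the set of opponent positions compatible with the referee's messages can be enormous, which is why the paper's stronger methods invest in better sampling rather than better evaluation alone.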
This paper provides a complexity analysis for the game of reconnaissance blind chess (RBC), a recently-introduced variant of chess where each player does not know the positions of the opponent's pieces a priori but may reveal a subset of them through chosen, private sensing actions. In contrast to commonly studied imperfect information games like poker and Kriegspiel, an RBC player does not know what the opponent knows or has chosen to learn, exponentially expanding the size of the game's information sets (i.e., the number of possible game states that are consistent with what a player has observed). Effective RBC sensing and moving strategies must account for the uncertainty of both players, an essential element of many real-world decision-making problems. Here we evaluate RBC from a game theoretic perspective, tracking the proliferation of information sets from the perspective of selected canonical bot players in tournament play. We show that, even for effective sensing strategies, the game sizes of RBC compare to those of Go while the average size of a player's information set throughout an RBC game is much greater than that of a player in Heads-up Limit Hold 'Em. We compare these measures of complexity among different playing algorithms and provide cursory assessments of the various sensing and moving strategies.
Ng, Brenda (Lawrence Livermore National Laboratory) | Meyers, Carol (Lawrence Livermore National Laboratory) | Boakye, Kofi (Lawrence Livermore National Laboratory) | Nitao, John (Lawrence Livermore National Laboratory)
We examine the suitability of using decision processes to model real-world systems of intelligent adversaries. Decision processes have long been used to study cooperative multiagent interactions, but their practical applicability to adversarial problems has received minimal study. We address the pros and cons of applying sequential decision-making in this area, using the crime of money laundering as a specific example. Motivated by case studies, we abstract out a model of the money laundering process, using the framework of interactive partially observable Markov decision processes (I-POMDPs). We address why this framework is well suited for modeling adversarial interactions. Particle filtering and value iteration are used to solve the model, with the application of different pruning and look-ahead strategies to assess the tradeoffs between solution quality and algorithmic run time. Our results show that there is a large gap in the level of realism that can currently be achieved by such decision models, largely due to computational demands that limit the size of problems that can be solved. While these results represent solutions to a simplified model of money laundering, they illustrate nonetheless the kinds of agent interactions that cannot be captured by standard approaches such as anomaly detection. This implies that I-POMDP methods may be valuable in the future, when algorithmic capabilities have further evolved.
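The particle-filtering step used to track beliefs in such models can be sketched minimally (hypothetical names; a real I-POMDP belief ranges over interactive states that include nested models of the other agent, which this toy omits): propagate each particle through the transition model, weight by the observation likelihood, and resample.

```python
import random

def particle_filter_update(particles, action, observation,
                           transition, obs_prob, n):
    """One bootstrap-filter step: propagate, weight by observation
    likelihood, and resample n particles."""
    propagated = [transition(p, action) for p in particles]
    weights = [obs_prob(observation, p, action) for p in propagated]
    if sum(weights) == 0:
        return propagated  # observation inconsistent with every particle
    return random.choices(propagated, weights=weights, k=n)
```

The abstract's pruning and look-ahead trade-offs live on top of a step like this: more particles and deeper look-ahead give better solution quality at sharply higher run time, which is the computational gap the authors report.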