We present an algorithm that identifies the reasoning patterns of agents in a game, by iteratively examining the graph structure of its Multi-Agent Influence Diagram (MAID) representation. If the decision of an agent participates in no reasoning patterns, then we can effectively ignore that decision for the purpose of calculating a Nash equilibrium for the game. In some cases, this can lead to exponential time savings in the process of equilibrium calculation. Moreover, our algorithm can be used to enumerate the reasoning patterns in a game, which can be useful for constructing more effective computerized agents interacting with humans.
In this paper, we investigate the hypothesis that plan recognition can significantly improve the performance of a case-based reinforcement learner in an adversarial action selection task. Our environment is a simplification of an American football game. The performance task is to control the behavior of a quarterback in a pass play, where the goal is to maximize yardage gained. Plan recognition focuses on predicting the play of the defensive team. We modeled plan recognition as an unsupervised learning task, and conducted a lesion study. We found that plan recognition was accurate, and that it significantly improved performance. More generally, our studies show that plan recognition reduced the dimensionality of the state space, which allowed learning to be conducted more effectively. We describe the algorithms, explain the reasons for performance improvement, and also describe a further empirical comparison that highlights the utility of plan recognition for this task.
Eric B. Baum 1 NEC Research Institute, 4 Independence Way, Princeton NJ 08540 eric@research.NJ.NEC.COM Abstract The point of game tree search is to insulate oneself from errors in the evaluation function. The standard approach is to grow a full width tree as deep as time allows, and then value the tree as if the leaf evaluations were exact. This has been effective in many games because of the computational efficiency of the alpha-beta algorithm. A Bayesian would suggest instead to train a model of one's uncertainty. This model adds extra information in addition to the standard evaluation function. Within such a formal model, there is an optimal tree growth procedure and an optimal method of valueing the tree. We describe how to optimally value the tree, and how to approximate on line the optimal tree to search.
Now the whole point of search (as opposed to just picking whichever child looks best to an evaluation function) is to insulate oneself from errors in the evaluation function. When one searches below a node, one gains more information and one's opinion of the value of that node may change. Such "opinion changes" are inherently probabilistic. They occur because one's information or computational abilities are unable to distinguish different states, e.g. a node with a given set of features might have different values. In this paper we adopt a probabilistic model of opinion changes, de-1This is a super-abbreviated discussion of [Baum and Smith, 1993] written by EBB for this conference.
We present a theoretical framework for online probabilistic plan recognition in cooperative multiagent systems. Our model extends the Abstract Hidden Markov Model (AHMM) (Bui, Venkatesh, & West 2002), and consists of a hierarchical dynamic Bayes network that allows reasoning about the interaction among multiple cooperating agents.