How to Do Things with Words: A Bayesian Approach
Gmytrasiewicz, Piotr (University of Illinois at Chicago)
Communication changes the beliefs of the listener and of the speaker. The value of a communicative act stems from the valuable belief states that result from it. To model this we build on the Interactive POMDP (IPOMDP) framework, which extends POMDPs to allow agents to model others in multi-agent settings, and we include communication that can take place between the agents to formulate Communicative IPOMDPs (CIPOMDPs). We treat communication as a type of action; therefore, decisions regarding communicative acts are based on decision-theoretic planning using the Bellman optimality principle and value iteration, just as they are for all other rational actions. As in any form of planning, the results of actions need to be precisely specified. We use Bayes' theorem to derive how agents update their beliefs in CIPOMDPs; updates are due to agents' actions, observations, messages they send to other agents, and messages they receive from others. The Bayesian decision-theoretic approach frees us from the commonly made assumption of cooperative discourse: we consider agents that are free to be dishonest while communicating and are guided only by their selfish rationality. We use a simple Tiger game to illustrate the belief update and to show that the ability to rationally communicate allows agents to improve the efficiency of their interactions.
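To make the belief update concrete, here is a minimal Python sketch for the Tiger game mentioned in the abstract. The 0.85 growl accuracy is the standard Tiger-problem value; the `trust` parameter that weights a received message is purely illustrative and stands in for the full CIPOMDP treatment of possibly dishonest senders.

```python
# Minimal sketch of a Bayesian belief update in the Tiger game.
# The 0.85 observation accuracy is the standard Tiger value; the
# message-reliability parameter `trust` is an illustrative stand-in
# for the full CIPOMDP handling of possibly dishonest senders.

STATES = ("tiger-left", "tiger-right")

def normalize(b):
    total = sum(b.values())
    return {s: p / total for s, p in b.items()}

def update_on_growl(belief, growl, accuracy=0.85):
    """P(s | growl) is proportional to P(growl | s) * P(s), after a 'listen' action."""
    likelihood = {
        "tiger-left":  accuracy if growl == "growl-left"  else 1 - accuracy,
        "tiger-right": accuracy if growl == "growl-right" else 1 - accuracy,
    }
    return normalize({s: likelihood[s] * belief[s] for s in STATES})

def update_on_message(belief, message, trust=0.7):
    """Treat a received message ('tiger-left'/'tiger-right') like an observation
    whose likelihood depends on how much the sender is trusted."""
    likelihood = {
        "tiger-left":  trust if message == "tiger-left"  else 1 - trust,
        "tiger-right": trust if message == "tiger-right" else 1 - trust,
    }
    return normalize({s: likelihood[s] * belief[s] for s in STATES})

belief = {"tiger-left": 0.5, "tiger-right": 0.5}
belief = update_on_growl(belief, "growl-left")      # ~0.85 on tiger-left
belief = update_on_message(belief, "tiger-left")    # a concurring message reinforces it
print(belief)
```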
Learning Others' Intentional Models in Multi-Agent Settings Using Interactive POMDPs
Han, Yanlin, Gmytrasiewicz, Piotr
Interactive partially observable Markov decision processes (I-POMDPs) provide a principled framework for planning and acting in a partially observable, stochastic, multi-agent environment. They extend POMDPs to multi-agent settings by including models of other agents in the state space and forming a hierarchical belief structure. In order to predict other agents' actions using I-POMDPs, we propose an approach that effectively uses Bayesian inference and sequential Monte Carlo sampling to learn others' intentional models, which ascribe to them beliefs, preferences, and rationality in action selection. Empirical results show that our algorithm accurately learns models of the other agent and outperforms methods that use subintentional models. Our approach serves as a generalized Bayesian learning algorithm that learns other agents' beliefs, strategy levels, and transition, observation, and reward functions.
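A minimal sequential Monte Carlo sketch, reduced to inferring a single scalar parameter of the other agent's model from its observed behavior; the paper's algorithm samples full intentional models (nested beliefs, frames), so everything below is illustrative rather than the authors' method.

```python
import random

# Minimal sequential Monte Carlo (particle filter) sketch: infer the probability p
# that the other agent acts correctly in an episode, from observed outcomes.
# The paper's algorithm samples full intentional models; this scalar version
# is only meant to illustrate the reweight/resample loop.

def likelihood(acted_correctly, p):
    return p if acted_correctly else 1.0 - p

def smc_step(particles, weights, acted_correctly):
    # Reweight each particle by how well it explains the observed behavior.
    weights = [w * likelihood(acted_correctly, p) for p, w in zip(particles, weights)]
    total = sum(weights)
    weights = [w / total for w in weights]
    # Resample to avoid degeneracy, then jitter the particles (rejuvenation).
    particles = random.choices(particles, weights=weights, k=len(particles))
    particles = [min(1.0, max(0.0, p + random.gauss(0.0, 0.02))) for p in particles]
    return particles, [1.0 / len(particles)] * len(particles)

particles = [random.random() for _ in range(500)]   # prior: p ~ Uniform(0, 1)
weights = [1.0 / len(particles)] * len(particles)
# Suppose the other agent was observed to act correctly in 8 of 10 episodes.
for outcome in [True] * 8 + [False] * 2:
    particles, weights = smc_step(particles, weights, outcome)
print(sum(particles) / len(particles))   # posterior mean, roughly 0.75
```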
Learning Others' Intentional Models in Multi-Agent Settings Using Interactive POMDPs
Han, Yanlin (University of Illinois at Chicago) | Gmytrasiewicz, Piotr (University of Illinois at Chicago)
Interactive partially observable Markov decision processes (I-POMDPs) provide a principled framework for planning and acting in a partially observable, stochastic, multi-agent environment, extending POMDPs to multi-agent settings by including models of other agents in the state space and forming a hierarchical belief structure. In order to predict other agents' actions using I-POMDPs, we propose an approach that effectively uses Bayesian inference and sequential Monte Carlo (SMC) sampling to learn others' intentional models, which ascribe to them beliefs, preferences, and rationality in action selection. Empirical results show that our algorithm accurately learns models of other agents and outperforms other methods. Our approach serves as a generalized reinforcement learning algorithm that learns other agents' beliefs and their transition, observation, and reward functions. It also effectively mitigates the belief-space complexity arising from the nested belief hierarchy.
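As a rough illustration of the nested belief hierarchy the last sentence refers to, here is a sketch of how a finitely nested interactive state could be represented: a level-l belief ranges over physical states paired with level-(l-1) models of the other agent, bottoming out at level 0 with a plain belief over physical states. All class and field names are my own, not the paper's.

```python
from dataclasses import dataclass, field

# Rough sketch of a finitely nested belief hierarchy of the kind I-POMDPs use.
# Names and the tiny Tiger-style example are illustrative.

@dataclass
class Level0Model:
    belief: dict                      # P(physical state), e.g. {"TL": 0.5, "TR": 0.5}

@dataclass
class InteractiveState:
    physical_state: str               # e.g. "TL" (tiger behind the left door)
    other_model: object               # a Level0Model, or a LevelModel one level down

@dataclass
class LevelModel:
    level: int
    states: list = field(default_factory=list)   # the interactive states
    belief: list = field(default_factory=list)   # P(states[i]), same indexing

# A level-1 belief: uncertain about the tiger and about the other's level-0 belief.
s0 = InteractiveState("TL", Level0Model({"TL": 0.5, "TR": 0.5}))
s1 = InteractiveState("TL", Level0Model({"TL": 0.85, "TR": 0.15}))
level1 = LevelModel(level=1, states=[s0, s1], belief=[0.6, 0.4])
print(level1.belief)
```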
Interactive Agent that Understands the User
Gmytrasiewicz, Piotr (University of Illinois at Chicago) | Moe, George (Harvard University) | Morena, Adolfo (University of Illinois at Chicago)
Our work uses the notion of theory of mind to enable an interactive agent to keep track of the state of knowledge, goals, and intentions of the human user, and to engage in and initiate sophisticated interactive behaviors using the decision-theoretic paradigm of maximizing expected utility. Currently, systems like Google Now and Siri mostly react to users' requests and commands using hand-crafted responses, but they cannot initiate intelligent communication and plan for longer-term interactions. The reason is that they lack a clearly defined general objective of the interaction. Our main premise is that communication and interaction are types of action, so planning for communicative and interactive actions should be based on a unified framework of decision-theoretic planning. To facilitate this, the system's state of knowledge (a mental model) about the world has to include a probabilistic representation of what is known, what is uncertain, and how things change as different events transpire. Further, the state of the user's knowledge and intentions (the theory of the user's mind) needs to include a precise specification of what the system knows, and how uncertain it is, about the user's mental model and about her desires and intentions. The theories of mind may be further nested to form interactive beliefs. Finally, decision-theoretic planning proposes that the desirability of possible sequences of interactive and communicative actions be assessed as expected utilities of alternative plans. We describe our preliminary implementation using the OpenCyc system, called MARTHA, and illustrate it in action using two simple interactive scenarios.
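To illustrate the expected-utility objective in a form deliberately much simpler than the OpenCyc-based implementation, here is a toy choice among candidate communicative acts given a distribution over the user's goal; all goals, acts, and utility values are invented for illustration.

```python
# Toy expected-utility choice among candidate communicative acts, given a
# probabilistic theory of the user's mind reduced to a distribution over the
# user's current goal. Everything here is illustrative, not MARTHA's representation.

belief_over_user_goal = {"wants-lunch": 0.7, "wants-to-study": 0.3}

# U(act, goal): how useful each candidate utterance is if the user holds that goal.
utility = {
    ("suggest-sandwich-shop", "wants-lunch"): 10, ("suggest-sandwich-shop", "wants-to-study"): -2,
    ("suggest-library", "wants-lunch"): -1,       ("suggest-library", "wants-to-study"): 8,
    ("say-nothing", "wants-lunch"): 0,            ("say-nothing", "wants-to-study"): 0,
}

def expected_utility(act):
    return sum(p * utility[(act, goal)] for goal, p in belief_over_user_goal.items())

acts = ["suggest-sandwich-shop", "suggest-library", "say-nothing"]
best = max(acts, key=expected_utility)
print(best, expected_utility(best))   # suggest-sandwich-shop: 0.7*10 + 0.3*(-2) = 6.4
```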
Bayesian Learning of Other Agents' Finite Controllers for Interactive POMDPs
Panella, Alessandro (University of Illinois at Chicago) | Gmytrasiewicz, Piotr (University of Illinois at Chicago)
We consider an autonomous agent that operates in a stochastic, partially observable, multiagent environment and explicitly models the other agents as probabilistic deterministic finite-state controllers (PDFCs) in order to predict their actions. We assume that such models are not given to the agent, but instead must be learned from (possibly imperfect) observations of the other agents' behavior. The agent maintains a belief over the other agents' models, which is updated via Bayesian inference. To represent this belief we place a flexible stick-breaking distribution over PDFCs, which allows the posterior to concentrate around controllers whose size is not bounded and scales with the complexity of the observed data. Since this Bayesian inference task is not analytically tractable, we devise a Markov chain Monte Carlo algorithm to approximate the posterior distribution. The agent then embeds the result of this inference into its own decision-making process using the interactive POMDP framework. We show that our learning algorithm can learn agent models that are behaviorally accurate for problems of varying complexity, and that the agent's performance increases as a result.
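A minimal sketch of the two objects the abstract mentions: a PDFC, in which node transitions are deterministic given the observation while the action emitted in each node is stochastic, and a truncated stick-breaking draw of the kind used as a flexible prior over controller nodes. The concentration parameter and the concrete two-node controller are illustrative, not from the paper.

```python
import random

# Minimal sketch of a probabilistic deterministic finite-state controller (PDFC)
# plus a truncated stick-breaking draw; values are illustrative.

def stick_breaking(alpha=1.0, truncation=10):
    """Draw truncated stick-breaking weights (they sum to just under 1)."""
    weights, remaining = [], 1.0
    for _ in range(truncation):
        v = random.betavariate(1.0, alpha)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    return weights

class PDFC:
    def __init__(self, emit, trans, start=0):
        self.emit = emit      # node -> {action: probability}
        self.trans = trans    # (node, observation) -> next node (deterministic)
        self.node = start

    def act(self):
        actions, probs = zip(*self.emit[self.node].items())
        return random.choices(actions, weights=probs, k=1)[0]

    def observe(self, obs):
        self.node = self.trans[(self.node, obs)]

# Two-node controller: listen in node 0; after any growl move to node 1,
# where the modeled agent usually opens a door.
ctrl = PDFC(
    emit={0: {"listen": 1.0}, 1: {"open-door": 0.9, "listen": 0.1}},
    trans={(0, "growl-left"): 1, (0, "growl-right"): 1,
           (1, "growl-left"): 1, (1, "growl-right"): 1},
)
print(stick_breaking())
ctrl.observe("growl-left")
print(ctrl.act())
```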
MARTHA Speaks: Implementing Theory of Mind for More Intuitive Communicative Acts
Gmytrasiewicz, Piotr (University of Illinois at Chicago) | Moe, George Herbert (Illinois Mathematics and Science Academy) | Moreno, Adolfo (University of Illinois at Chicago)
The theory of mind is an important human capability that allows us to understand and predict the goals, intents, and beliefs of other individuals. We present an approach to designing intelligent communicative agents based on modeling theories of mind. This can be tricky because other agents may also have their own theories of mind of the first agent, meaning that these mental models are naturally nested in layers. So, to look for intuitive communicative acts, we recursively apply a planning algorithm in each of these nested layers, looking for possible plans of action as well as their hypothetical consequences, which include the reactions of other agents; we propose that truly intelligent communicative acts are the ones that produce a state of maximum decision-theoretic utility according to the entire theory of mind. We implement these ideas using Java and OpenCyc in an attempt to create an assistive AI we call MARTHA. We demonstrate MARTHA's capabilities with two motivating examples: helping the user buy a sandwich and helping the user search for an activity. We see that, in addition to being a personal assistant, MARTHA can be extended to other assistive fields, such as finance, research, and government.
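As a toy illustration of recursively applying a planner inside nested theory-of-mind layers, the sketch below best-responds to the action predicted one level down, bottoming out at a level-0 default. The two-action payoff table is generic and is not MARTHA's actual domain or planner.

```python
# Toy recursive planning over nested theory-of-mind layers: at level 0 the
# modeled agent follows a default policy; at level l > 0 an agent predicts the
# other's level-(l-1) choice and best-responds to it. Payoffs are illustrative.

ACTIONS = ["cooperate", "defect"]
PAYOFF = {  # (my action, predicted other action) -> my utility
    ("cooperate", "cooperate"): 3, ("cooperate", "defect"): 0,
    ("defect", "cooperate"): 4,    ("defect", "defect"): 1,
}

def plan(level):
    """Action chosen by an agent reasoning at the given nesting depth."""
    if level == 0:
        return "cooperate"                      # level-0 default model
    predicted_other = plan(level - 1)           # what the other is expected to do
    return max(ACTIONS, key=lambda a: PAYOFF[(a, predicted_other)])

for depth in range(4):
    print(depth, plan(depth))
```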
Modeling Bounded Rationality of Agents During Interactions
Guo, Qing (University of Illinois at Chicago) | Gmytrasiewicz, Piotr (University of Illinois at Chicago)
Frequently, it is advantageous for an agent to model other agents in order to predict their behavior during an interaction. Modeling others as rational has a long tradition in AI and game theory, but modeling other agents' departures from rationality is difficult and controversial. This paper proposes that bounded rationality be modeled as errors the agent being modeled is making while deciding on its action. We are motivated by the work on quantal response equilibria in behavioral game theory, which uses Nash equilibria as the solution concept. In contrast, we use decision-theoretic maximization of expected utility. Quantal response assumes that a decision maker is rational, i.e., is maximizing its expected utility, but only approximately so, with an error rate characterized by a single error parameter. Another agent's error rate may be unknown and needs to be estimated during an interaction. We show that the error rate of the quantal response can be estimated using a Bayesian update of a suitable conjugate prior, and that it has a finite-dimensional sufficient statistic under strong simplifying assumptions. However, if the simplifying assumptions are relaxed, the quantal response does not admit a finite sufficient statistic and a more complex update is needed. This confirms the difficulty of using simple models of bounded rationality in general settings.
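A minimal sketch of the two ingredients named in the abstract: a quantal (softmax) response whose precision parameter controls how closely the modeled agent tracks expected-utility maximization, and, under a deliberately strong simplification in which the agent errs with probability eps independently at each decision, a conjugate Beta update of eps whose sufficient statistic is just the error count. The numbers and prior are illustrative, not the paper's experiments.

```python
import math

# Quantal (softmax) response and a simplified conjugate update of an error rate.

def quantal_response(expected_utils, lambda_=2.0):
    """P(a) proportional to exp(lambda_ * EU(a)); larger lambda_ means closer to exact maximization."""
    exps = {a: math.exp(lambda_ * u) for a, u in expected_utils.items()}
    total = sum(exps.values())
    return {a: e / total for a, e in exps.items()}

print(quantal_response({"open-left": 1.0, "open-right": -5.0, "listen": 0.5}))

# Under the simplified "err with probability eps at each decision" model, observed
# optimal/suboptimal choices are Bernoulli draws, so a Beta prior updates by counts.
alpha, beta = 1.0, 1.0                       # Beta(1, 1) prior on eps
observed_optimal = [True, True, False, True, True, True, False, True]
errors = sum(1 for ok in observed_optimal if not ok)
alpha += errors
beta += len(observed_optimal) - errors
print("posterior mean error rate:", alpha / (alpha + beta))   # (1+2)/(2+8) = 0.3
```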