Learning Others' Intentional Models in Multi-Agent Settings Using Interactive POMDPs
Interactive partially observable Markov decision processes (I-POMDPs) provide a principled framework for planning and acting in partially observable, stochastic, multi-agent environments. They extend POMDPs to multi-agent settings by including models of other agents in the state space and forming a hierarchical belief structure. To predict other agents' actions using I-POMDPs, we propose an approach that effectively uses Bayesian inference and sequential Monte Carlo (SMC) sampling to learn others' intentional models, which ascribe to them beliefs, preferences, and rationality in action selection. Empirical results show that our algorithm accurately learns models of the other agent and outperforms methods that use subintentional models. Our approach serves as a generalized Bayesian learning algorithm that learns other agents' beliefs, strategy levels, and transition, observation, and reward functions. It also effectively mitigates the belief-space complexity due to the nested belief hierarchy.
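To make the Bayesian inference step concrete, here is a minimal sketch (not the authors' implementation) of a sequential Monte Carlo update over candidate models of another agent: each particle is a hypothesized model, weights are multiplied by the likelihood that the model assigns to the observed action, and the particle set is then resampled. The function names and model representation are illustrative assumptions.

```python
import random

def smc_update(particles, weights, observed_action, action_likelihood):
    """One Bayesian update step over candidate models of the other agent.

    particles         : list of candidate models (any representation)
    weights           : current particle weights (sum to 1)
    observed_action   : the action the other agent was seen to take
    action_likelihood : fn(model, action) -> P(action | model)
    """
    # Reweight each candidate model by the likelihood of the observed action.
    new_w = [w * action_likelihood(m, observed_action)
             for m, w in zip(particles, weights)]
    total = sum(new_w)
    if total == 0:
        # No candidate model explains the action; fall back to uniform weights.
        new_w = [1.0 / len(particles)] * len(particles)
    else:
        new_w = [w / total for w in new_w]
    # Resample to concentrate particles on well-supported models.
    resampled = random.choices(particles, weights=new_w, k=len(particles))
    return resampled, [1.0 / len(particles)] * len(particles)
```

After repeated observations, poorly predicting models are resampled away, which is the sense in which the particle set "learns" the other agent's intentional model.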
Reviews: Learning Others' Intentional Models in Multi-Agent Settings Using Interactive POMDPs
The paper describes a sampling method for learning agent behaviors in interactive POMDPs (I-POMDPs). An I-POMDP is a multi-agent extension of the POMDP in which the belief space includes, in addition to a belief about the environment state, nested recursive beliefs about the other agents' models. I-POMDP solution methods, including the one proposed in the paper, largely approximate this hierarchy at a finite depth, modeling others either intentionally (e.g., via their nested beliefs, state transitions, optimality criterion, etc.) or subintentionally (e.g., via "summaries of behavior" such as fictitious play). The proposed approach samples models of the other agent at a particular depth to compute its values and policy. Related work on an interactive particle filter assumed the full frame was known (b, S, A, Omega, T, R, OC).
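The finite-depth nesting the review describes can be illustrated with a small data structure (a hypothetical sketch, not the paper's representation): a level-0 belief is a distribution over physical states only, while a level-k belief additionally ranges over the other agent's level-(k-1) models.

```python
from dataclasses import dataclass, field

@dataclass
class InteractiveBelief:
    level: int
    state_dist: dict                    # P(physical state)
    other_models: list = field(default_factory=list)
        # at level > 0: list of (prob, InteractiveBelief at level - 1)

def build_uniform(levels, states):
    """Build a uniform interactive belief nested to the given depth."""
    # Level 0: a plain distribution over physical states, no nested models.
    b = InteractiveBelief(0, {s: 1.0 / len(states) for s in states})
    # Each higher level wraps the one below as the other agent's model.
    for k in range(1, levels + 1):
        b = InteractiveBelief(k,
                              {s: 1.0 / len(states) for s in states},
                              [(1.0, b)])
    return b
```

Truncating the recursion at a fixed depth like this is what makes the hierarchy finitely representable, at the cost of the approximation the review notes.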
Han, Yanlin, Gmytrasiewicz, Piotr
Published at the Neural Information Processing Systems Conference.
Han, Yanlin (University of Illinois at Chicago) | Gmytrasiewicz, Piotr (University of Illinois at Chicago)