Spaan, Matthijs


Efficient Offline Communication Policies for Factored Multiagent POMDPs

Neural Information Processing Systems

Factored Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) form a powerful framework for multiagent planning under uncertainty, but optimal solutions require a rigid history-based policy representation. In this paper we allow inter-agent communication which turns the problem in a centralized Multiagent POMDP (MPOMDP). We map belief distributions over state factors to an agent's local actions by exploiting structure in the joint MPOMDP policy. The key point is that when sparse dependencies between the agents' decisions exist, often the belief over its local state factors is sufficient for an agent to unequivocally identify the optimal action, and communication can be avoided. We formalize these notions by casting the problem into convex optimization form, and present experimental results illustrating the savings in communication that we can obtain.


The 2015 AAAI Fall Symposium Series Reports

AI Magazine

The Association for the Advancement of Artificial Intelligence presented the 2015 Fall Symposium Series, on Thursday through Saturday, November 12-14, at the Westin Arlington Gateway in Arlington, Virginia. The titles of the six symposia were as follows: AI for Human-Robot Interaction, Cognitive Assistance in Government and Public Sector Applications, Deceptive and Counter-Deceptive Machines, Embedded Machine Learning, Self-Confidence in Autonomous Systems, and Sequential Decision Making for Intelligent Agents. This article contains the reports from four of the symposia.


The 2015 AAAI Fall Symposium Series Reports

AI Magazine

The Association for the Advancement of Artificial Intelligence presented the 2015 Fall Symposium Series, on Thursday through Saturday, November 12-14, at the Westin Arlington Gateway in Arlington, Virginia. The titles of the six symposia were as follows: AI for Human-Robot Interaction, Cognitive Assistance in Government and Public Sector Applications, Deceptive and Counter-Deceptive Machines, Embedded Machine Learning, Self-Confidence in Autonomous Systems, and Sequential Decision Making for Intelligent Agents. This article contains the reports from four of the symposia.


Point-Based POMDP Solving with Factored Value Function Approximation

AAAI Conferences

Partially observable Markov decision processes (POMDPs) provide a principled mathematical framework for modeling autonomous decision-making problems. A POMDP solution is often represented by a value function comprised of a set of vectors. In the case of factored models, the size of these vectors grows exponentially with the number of state factors, leading to scalability issues. We consider an approximate value function representation based on a linear combination of basis functions. In particular, we present a backup operator that can be used in any point-based POMDP solver. Furthermore, we show how under certain conditions independence between observation factors can be exploited for large computational gains. We experimentally verify our contributions and show that they have the potential to improve point-based methods in policy quality and solution size.


Veiga

AAAI Conferences

Partially observable Markov decision processes (POMDPs) provide a principled mathematical framework for modeling autonomous decision-making problems. A POMDP solution is often represented by a value function comprised of a set of vectors. In the case of factored models, the size of these vectors grows exponentially with the number of state factors, leading to scalability issues. We consider an approximate value function representation based on a linear combination of basis functions. In particular, we present a backup operator that can be used in any point-based POMDP solver. Furthermore, we show how under certain conditions independence between observation factors can be exploited for large computational gains. We experimentally verify our contributions and show that they have the potential to improve point-based methods in policy quality and solution size.


Bounded Approximations for Linear Multi-Objective Planning Under Uncertainty

AAAI Conferences

Planning under uncertainty poses a complex problem in which multiple objectives often need to be balanced. When dealing with multiple objectives, it is often assumed that the relative importance of the objectives is known a priori. However, in practice human decision makers often find it hard to specify such preferences, and would prefer a decision support system that presents a range of possible alternatives. We propose two algorithms for computing these alternatives for the case of linearly weighted objectives. First, we propose an anytime method, approximate optimistic linear support (AOLS), that incrementally builds up a complete set of ε-optimal plans, exploiting the piecewise linear and convex shape of the value function. Second, we propose an approximate anytime method, scalarised sample incremental improvement (SSII), that employs weight sampling to focus on the most interesting regions in weight space, as suggested by a prior over preferences. We show empirically that our methods are able to produce (near-)optimal alternative sets orders of magnitude faster than existing techniques.


Decentralized Multi-Robot Cooperation with Auctioned POMDPs

AAAI Conferences

Planning under uncertainty faces a scalability problem when considering multi-robot teams, as the information space scales exponentially with the number of robots. To address this issue, this paper proposes to decentralize multi-robot Partially Observable Markov Decision Processes (POMDPs) while maintaining cooperation between robots by using POMDP policy auctions. Auctions provide a flexible way of coordinating individual policies modeled by POMDPs and have low communication requirements. Additionally, communication models in the multi-agent POMDP literature severely mismatch with real inter-robot communication. We address this issue by exploiting a decentralized data fusion method in order to efficiently maintain a joint belief state among the robots. The paper presents results in two different applications: environmental monitoring with Unmanned Aerial Vehicles (UAVs); and cooperative tracking, in which several robots have to jointly track a moving target of interest.


Capitan

AAAI Conferences

Planning under uncertainty faces a scalability problem when considering multi-robot teams, as the information space scales exponentially with the number of robots. To address this issue, this paper proposes to decentralize multi-robot Partially Observable Markov Decision Processes (POMDPs) while maintaining cooperation between robots by using POMDP policy auctions. Auctions provide a flexible way of coordinating individual policies modeled by POMDPs and have low communication requirements.


Roijers

AAAI Conferences

Planning under uncertainty poses a complex problem in which multiple objectives often need to be balanced. When dealing with multiple objectives, it is often assumed that the relative importance of the objectives is known a priori. However, in practice human decision makers often find it hard to specify such preferences, and would prefer a decision support system that presents a range of possible alternatives. We propose two algorithms for computing these alternatives for the case of linearly weighted objectives. First, we propose an anytime method, approximate optimistic linear support (AOLS), that incrementally builds up a complete set of ε-optimal plans, exploiting the piecewise linear and convex shape of the value function. Second, we propose an approximate anytime method, scalarised sample incremental improvement (SSII), that employs weight sampling to focus on the most interesting regions in weight space, as suggested by a prior over preferences. We show empirically that our methods are able to produce (near-)optimal alternative sets orders of magnitude faster than existing techniques.


GSMDPs for Multi-Robot Sequential Decision-Making

AAAI Conferences

Markov Decision Processes (MDPs) provide an extensive theoretical background for problems of decision-making under uncertainty. In order to maintain computational tractability, however, real-world problems are typically discretized in states and actions as well as in time. Assuming synchronous state transitions and actions at fixed rates may result in models which are not strictly Markovian, or where agents are forced to idle between actions, losing their ability to react to sudden changes in the environment. In this work, we explore the application of Generalized Semi-Markov Decision Processes (GSMDPs) to a realistic multi-robot scenario. A case study will be presented in the domain of cooperative robotics, where real-time reactivity must be preserved, and synchronous discrete-time approaches are therefore sub-optimal. This case study is tested on a team of real robots, and also in realistic simulation. By allowing asynchronous events to be modeled over continuous time, the GSMDP approach is shown to provide greater solution quality than its discrete-time counterparts, while still being approximately solvable by existing methods.