Solving infinite-horizon Dec-POMDPs using Finite State Controllers within JESP

You, Yang; Thomas, Vincent; Colas, Francis; Buffet, Olivier

Sep-17-2021–arXiv.org Artificial Intelligence

This paper addresses collaborative planning problems formalized as Decentralized POMDPs (Dec-POMDPs) by searching for Nash equilibria, i.e., situations where each agent's policy is a best response to the other agents' (fixed) policies. The Joint Equilibrium-based Search for Policies (JESP) algorithm does this in the finite-horizon setting using policy trees; we propose to adapt it to infinite-horizon Dec-POMDPs by using finite state controller (FSC) policy representations. In this article, we (1) explain how to turn a Dec-POMDP with $N-1$ fixed FSCs into an infinite-horizon POMDP whose solution is a best response for the $N^\text{th}$ agent; (2) propose a JESP variant, called Inf-JESP, that uses this construction to solve infinite-horizon Dec-POMDPs; (3) introduce heuristic initializations for JESP aimed at reaching good solutions; and (4) conduct experiments on state-of-the-art benchmark problems to evaluate our approach.
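
To make step (1) concrete, here is a minimal Python sketch for the two-agent case. It is one plausible way to fold a fixed FSC into the model and is not necessarily the exact construction used in the paper. It assumes the other agent's FSC is deterministic, and it pairs the hidden state with that agent's current FSC node to form the state of the best-response POMDP. The function name `best_response_pomdp`, the table layouts, and all numbers in the toy model are hypothetical and chosen only for illustration.

```python
from itertools import product


def best_response_pomdp(T, O, R, fsc_action, fsc_next):
    """Fold agent 2's fixed FSC into the model seen by agent 1 (illustrative).

    T[s][a1][a2][s2]       : state transition probability
    O[a1][a2][s2][o1][o2]  : joint observation probability
    R[s][a1][a2]           : shared reward
    fsc_action[n]          : action agent 2 takes in FSC node n (assumed deterministic)
    fsc_next[n][o2]        : agent 2's next node after observing o2
    """
    n_states, n_a1 = len(T), len(T[0])
    n_nodes = len(fsc_action)
    n_o1, n_o2 = len(O[0][0][0]), len(O[0][0][0][0])
    dynamics, reward = {}, {}
    for s, n, a1 in product(range(n_states), range(n_nodes), range(n_a1)):
        a2 = fsc_action[n]  # agent 2's action is fixed by its current node
        reward[(s, n), a1] = R[s][a1][a2]
        out = {}
        for s2, o1, o2 in product(range(n_states), range(n_o1), range(n_o2)):
            p = T[s][a1][a2][s2] * O[a1][a2][s2][o1][o2]
            if p > 0.0:
                # Agent 2's private observation o2 is summed out; it only
                # determines which FSC node agent 2 moves to.
                key = ((s2, fsc_next[n][o2]), o1)
                out[key] = out.get(key, 0.0) + p
        dynamics[(s, n), a1] = out
    return dynamics, reward


def acc(o, s2):
    # Hypothetical sensor: each agent sees the true state with probability 0.8.
    return 0.8 if o == s2 else 0.2


# Toy model with 2 states, 2 actions and 2 observations per agent (made-up numbers).
T = [[[[0.9, 0.1] if a1 == a2 else [0.5, 0.5] for a2 in range(2)]
      for a1 in range(2)] for s in range(2)]
O = [[[[[acc(o1, s2) * acc(o2, s2) for o2 in range(2)] for o1 in range(2)]
       for s2 in range(2)] for a2 in range(2)] for a1 in range(2)]
R = [[[1.0 if a1 == a2 == s else 0.0 for a2 in range(2)]
      for a1 in range(2)] for s in range(2)]

# Agent 2's FSC: node k plays action k and moves to the node matching its last observation.
fsc_action = [0, 1]
fsc_next = [[0, 1], [0, 1]]

dynamics, reward = best_response_pomdp(T, O, R, fsc_action, fsc_next)
```

Each entry `dynamics[(s, n), a1]` maps a pair `((s2, n2), o1)` to its probability, so the result can be passed to any single-agent POMDP solver that accepts a joint transition-observation model. A JESP-style loop would then alternate between agents, rebuilding this POMDP around the other agent's latest FSC and solving it for a best response, until no agent can improve.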


Country:
- North America > United States (0.61)
- Oceania > Australia
  - Australian Capital Territory > Canberra (0.04)
- Europe > France
  - Grand Est > Meurthe-et-Moselle > Nancy (0.04)

Genre:
- Research Report (0.64)

Industry:
- Government > Regional Government > North America Government > United States Government (0.61)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)