On Bellman's Optimality Principle for zs-POSGs

Buffet, Olivier, Dibangoye, Jilles, Delage, Aurélien, Saffidine, Abdallah, Thomas, Vincent

Jun-29-2020–arXiv.org Artificial Intelligence

Many non-trivial sequential decision-making problems are solved efficiently by relying on Bellman's optimality principle, i.e., by exploiting the fact that sub-problems are nested recursively within the original problem. Here we show how this principle applies to (infinite-horizon) 2-player zero-sum partially observable stochastic games (zs-POSGs) by (i) taking the viewpoint of a central planner, who can reason only on a sufficient statistic called the occupancy state, and (ii) turning such problems into zero-sum occupancy Markov games (zs-OMGs). Then, exploiting the Lipschitz continuity of the value function in occupancy space, one can derive a version of the HSVI (Heuristic Search Value Iteration) algorithm that provably finds an $\epsilon$-Nash equilibrium in finite time.
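To illustrate how Lipschitz continuity can yield usable value bounds, here is a minimal sketch of generic pointwise upper and lower bounds on a Lipschitz function, built from values known at a few sampled points. It is not the paper's construction: the class name `LipschitzBounds`, the choice of the L1 norm, the constant `lam`, and the representation of occupancy states as plain vectors are all assumptions made for illustration.

```python
import numpy as np


class LipschitzBounds:
    """Pointwise bounds on a lam-Lipschitz function (L1 norm assumed).

    Hypothetical helper: names and norm are illustrative assumptions,
    not taken from the paper.
    """

    def __init__(self, lam):
        self.lam = float(lam)
        self.points = []  # sampled points, e.g. occupancy states as vectors
        self.lowers = []  # known lower-bound values at those points
        self.uppers = []  # known upper-bound values at those points

    def add(self, x, lower, upper):
        """Record bounds known to hold at point x."""
        self.points.append(np.asarray(x, dtype=float))
        self.lowers.append(float(lower))
        self.uppers.append(float(upper))

    def _dist(self, x, p):
        # L1 distance between two vectors
        return float(np.abs(x - p).sum())

    def upper(self, x):
        """Tightest upper bound implied by the samples: min_i u_i + lam * d(x, x_i)."""
        x = np.asarray(x, dtype=float)
        return min(u + self.lam * self._dist(x, p)
                   for p, u in zip(self.points, self.uppers))

    def lower(self, x):
        """Tightest lower bound implied by the samples: max_i l_i - lam * d(x, x_i)."""
        x = np.asarray(x, dtype=float)
        return max(l - self.lam * self._dist(x, p)
                   for p, l in zip(self.points, self.lowers))

    def gap(self, x):
        """Width of the bound interval at x."""
        return self.upper(x) - self.lower(x)


bounds = LipschitzBounds(lam=2.0)
bounds.add([1.0, 0.0], lower=0.0, upper=1.0)
bounds.add([0.0, 1.0], lower=0.5, upper=0.8)
print(bounds.gap([1.0, 0.0]))
```

In an HSVI-style search, bounds of this kind are refined at visited points, and the search stops once the gap at the initial point falls below $\epsilon$; the update rule itself is specific to the paper and is not reproduced here.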

artificial intelligence, machine learning, optimization problem

Country:
- Oceania > Australia
  - New South Wales > Sydney (0.04)
- Europe
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.04)
  - Netherlands > North Holland
    - Amsterdam (0.04)
  - France > Grand Est
    - Meurthe-et-Moselle > Nancy (0.04)

Genre:
- Research Report (0.64)

Industry:
- Leisure & Entertainment > Games (1.00)

Technology:
- Information Technology
  - Game Theory (1.00)
  - Artificial Intelligence
    - Representation & Reasoning
      - Optimization (0.93)
      - Search (0.68)
    - Machine Learning > Learning Graphical Models
      - Undirected Networks > Markov Models (0.95)
