Policy Evaluation in Decentralized POMDPs with Belief Sharing

Kayaalp, Mert, Ghadieh, Fatima, Sayed, Ali H.

May-16-2023–arXiv.org Artificial Intelligence

Most works on multi-agent reinforcement learning focus on scenarios where the state of the environment is fully observable. In this work, we consider a cooperative policy evaluation task in which agents are not assumed to observe the environment state directly. Instead, agents have access only to noisy observations and to belief vectors. It is well known that finding global posterior distributions in multi-agent settings is generally NP-hard. As a remedy, we propose a fully decentralized belief-forming strategy that relies on individual updates and on localized interactions over a communication network. In addition to exchanging beliefs, agents use the communication network to exchange value function parameter estimates. We show analytically that the proposed strategy allows information to diffuse over the network, so that the agents' parameters remain within a bounded difference of a centralized baseline. A multi-sensor target tracking application is considered in the simulations.
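
The abstract does not spell out the belief update itself, so the following is only a minimal sketch of what "individual updates" followed by "localized interactions" could look like. It assumes a finite state space, a known transition matrix, strictly positive likelihoods, and a weighted geometric average as the combination rule. All of these choices, and the names diffusion_belief_step, transition, likelihoods, and weights, are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def diffusion_belief_step(beliefs, likelihoods, weights, transition):
    """One hypothetical adapt-then-combine belief update (illustrative only).

    beliefs:     (num_agents, num_states) current belief vectors, rows sum to 1.
    likelihoods: (num_agents, num_states) likelihood of each agent's new
                 observation under every state.
    weights:     (num_agents, num_agents) combination matrix; weights[l, k] is
                 the weight agent k assigns to neighbor l, columns sum to 1.
    transition:  (num_states, num_states) matrix with
                 transition[s, s_next] = P(s_next | s).
    """
    # Individual update, part 1: propagate each belief through the state dynamics.
    predicted = beliefs @ transition

    # Individual update, part 2: local Bayesian update with the agent's own observation.
    intermediate = predicted * likelihoods
    intermediate /= intermediate.sum(axis=1, keepdims=True)

    # Localized interaction: weighted geometric average of neighbors' beliefs,
    # computed in the log domain (clipped to avoid log(0)).
    log_beliefs = np.log(np.clip(intermediate, 1e-300, None))
    log_combined = weights.T @ log_beliefs
    log_combined -= log_combined.max(axis=1, keepdims=True)  # numerical stability
    combined = np.exp(log_combined)
    return combined / combined.sum(axis=1, keepdims=True)
```

Under these assumptions, each agent only needs its own observation likelihood and the intermediate beliefs of its immediate neighbors, so no agent ever forms the global posterior. The exchange of value function parameter estimates mentioned above would be a separate averaging step over the same network and is not shown here.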

artificial intelligence, machine learning, reinforcement learning

Country:
- North America > United States
  - New York > New York County
    - New York City (0.04)
  - California > San Francisco County
    - San Francisco (0.04)
- Europe
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.04)
  - Switzerland > Vaud
    - Lausanne (0.04)
- Asia > Middle East
  - Jordan (0.04)
  - Lebanon > Beirut Governorate
    - Beirut (0.04)

Genre:
- Research Report (0.81)

Industry:
- Information Technology (0.46)

Technology:
- Information Technology
  - Communications > Networks (1.00)
  - Artificial Intelligence
    - Representation & Reasoning > Agents (1.00)
    - Machine Learning
      - Reinforcement Learning (1.00)
      - Neural Networks (0.92)
      - Learning Graphical Models > Undirected Networks > Markov Models (1.00)
