PAC: Assisted Value Factorization with Counterfactual Predictions in Multi-Agent Reinforcement Learning

Oct-11-2024, 09:10:26 GMT–Neural Information Processing Systems

Multi-agent reinforcement learning (MARL) has made significant progress with the development of value function factorization methods, which allow a joint action-value function to be optimized by maximizing factorized per-agent utilities. In this paper, we show that in partially observable MARL problems, an agent's ordering over its own actions can impose concurrent constraints (across different states) on the representable function class, causing significant estimation errors during training. To address this limitation, we propose PAC, a new framework that leverages Assistive information generated from Counterfactual Predictions of optimal joint action selection; this information enables explicit assistance to value function factorization through a novel counterfactual loss. A variational inference-based information encoding method is developed to collect and encode the counterfactual predictions from an estimated baseline.
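
To make the idea of value function factorization concrete, here is a minimal sketch of the general principle the abstract refers to, not of PAC itself. It assumes the simplest possible factorization, a plain sum of per-agent utilities, and checks on a toy example that each agent greedily maximizing its own utility recovers the joint action that maximizes the joint value. The utility numbers and the function names (`greedy_joint_action`, `additive_q_tot`, `brute_force_joint_action`) are illustrative assumptions and do not come from the paper.

```python
from itertools import product


def greedy_joint_action(utilities):
    """Decentralized selection: each agent takes the argmax of its own utilities."""
    return tuple(max(range(len(u)), key=u.__getitem__) for u in utilities)


def additive_q_tot(utilities, joint_action):
    """Assumed factorization: the joint value is the sum of per-agent utilities."""
    return sum(u[a] for u, a in zip(utilities, joint_action))


def brute_force_joint_action(utilities):
    """Centralized selection: search every joint action for the best joint value."""
    action_spaces = [range(len(u)) for u in utilities]
    return max(product(*action_spaces),
               key=lambda ja: additive_q_tot(utilities, ja))


# Hypothetical per-agent utilities for one observation (3 agents, unequal action sets).
utilities = [
    [0.2, 1.5, -0.3],
    [0.9, 0.1],
    [0.0, 0.4, 0.7, 0.6],
]

decentralized = greedy_joint_action(utilities)
centralized = brute_force_joint_action(utilities)
print(decentralized, centralized, decentralized == centralized)
```

The per-agent search costs the sum of the action-set sizes, while the brute-force search costs their product, which is why factorization is attractive. The agreement shown here holds only because the assumed mixing is additive; the abstract's point is that under partial observability the function class imposed by such factorizations can be too restrictive, which this toy example does not model.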

assisted value factorization, counterfactual prediction, multi-agent reinforcement learning

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Agents (1.00)
  - Machine Learning > Reinforcement Learning (1.00)