In-Context Fully Decentralized Cooperative Multi-Agent Reinforcement Learning

Jun-22-2026, 22:27:55 GMT–Neural Information Processing Systems

In this paper, we consider fully decentralized cooperative multi-agent reinforcement learning, where each agent has access only to the states, its local actions, and the shared rewards. The absence of information about other agents' actions typically leads to the non-stationarity problem during per-agent value function updates and to the relative overgeneralization issue during value function estimation. Existing works fail to address both issues simultaneously, because they cannot model the agents' joint policy in a fully decentralized setting. To overcome this limitation, we propose a simple yet effective method named Return-Aware Context (RAC). RAC formalizes the dynamically changing task, as locally perceived by each agent, as a contextual Markov Decision Process (MDP), and addresses both non-stationarity and relative overgeneralization through return-aware context modeling. Specifically, the contextual MDP attributes the non-stationary local dynamics of each agent to switches between contexts, each corresponding to a distinct joint policy.
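The two issues named in the abstract can be illustrated with a toy example. The sketch below is not the RAC method; it assumes a hypothetical two-agent cooperative matrix game with a climbing-game-style payoff (the matrix, the function name `local_values`, and the two "contexts" are illustrative choices, not taken from the paper). It shows that the value one agent assigns to its own actions depends on the other agent's policy, so the locally perceived task changes when that policy changes, and that averaging over an exploring partner favors a safe but suboptimal action.

```python
import numpy as np

# Hypothetical shared-reward matrix (assumption, not from the paper).
# Rows: agent 1's action. Columns: agent 2's action.
PAYOFF = np.array([
    [11.0, -30.0, 0.0],
    [-30.0, 7.0, 6.0],
    [0.0, 0.0, 5.0],
])


def local_values(payoff, other_policy):
    """Expected shared reward of each of agent 1's actions, given
    agent 2's action distribution (the 'context' in this toy setup)."""
    return payoff @ other_policy


# Context A: agent 2 explores uniformly over its three actions.
uniform = np.ones(3) / 3
# Context B: agent 2 has committed to its first action.
committed = np.array([1.0, 0.0, 0.0])

values_uniform = local_values(PAYOFF, uniform)
values_committed = local_values(PAYOFF, committed)

# Index of agent 1's locally best action under each context.
best_uniform = int(np.argmax(values_uniform))
best_committed = int(np.argmax(values_committed))

print("context A values:", values_uniform, "best action:", best_uniform)
print("context B values:", values_committed, "best action:", best_committed)
```

Under context A the locally best action is the third row, even though the highest joint reward (11) requires the first row: this is the relative overgeneralization effect. Under context B the first row becomes best, so the same agent sees different local values for the same actions, which is the non-stationarity that the abstract attributes to switches between joint policies.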

agent, artificial intelligence, machine learning, (14 more...)

Country:
- Asia > China (0.28)

Genre:
- Research Report > Experimental Study (1.00)

Industry:
- Information Technology (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Agents
    - Agent Societies (0.84)
  - Machine Learning > Learning Graphical Models
    - Undirected Networks > Markov Models (0.88)
