A Background

Feb-18-2024, 04:16:58 GMT–Neural Information Processing Systems

A.1 Partially Observable Markov Decision Process

We follow previous work [25] and treat MARL as a partially observable Markov game [22]. We define a set of states S describing the possible configurations of all n agents. Each agent i then receives a reward r as a function of the state and that agent's action. In what follows, we use superscripts to indicate the agent index and subscripts to indicate the time step for states, observations, rewards, and actions.

A.2 Decision Transformer

Decision Transformer [3] uses the Transformer [44], an architecture for efficiently modeling sequential data, and shows that the RL problem can be cast as conditional sequence modeling. The core component of the Transformer is the attention mechanism [44].
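To make the attention mechanism mentioned above concrete, the following is a minimal sketch of scaled dot-product attention in plain Python. It is illustrative only and is not taken from the paper: the function names `softmax` and `attention`, the list-of-lists representation, and the `causal` flag are assumptions introduced here. The causal option (each position attends only to itself and earlier positions) reflects how autoregressive sequence models are usually set up; this section does not spell out the masking used.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values, causal=True):
    """Scaled dot-product attention over lists of vectors (hypothetical helper).

    queries, keys: lists of vectors sharing one dimension d_k.
    values: list of vectors, one per key.
    With causal=True, position t attends only to positions <= t.
    """
    d_k = len(keys[0])
    d_v = len(values[0])
    outputs = []
    for t, q in enumerate(queries):
        # Number of key/value positions visible to query t.
        limit = t + 1 if causal else len(keys)
        # Dot product of the query with each visible key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys[:limit]
        ]
        weights = softmax(scores)
        # Output is the attention-weighted average of the visible values.
        out = [
            sum(w * v[j] for w, v in zip(weights, values[:limit]))
            for j in range(d_v)
        ]
        outputs.append(out)
    return outputs


if __name__ == "__main__":
    tokens = [[1.0, 0.0], [0.0, 1.0]]
    print(attention(tokens, tokens, tokens, causal=True))
```

In this sketch the first position can only attend to itself, so its output equals its own value vector, while the second position mixes both value vectors with weights that sum to one.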



