Offline Multi-Agent Reinforcement Learning with Knowledge Distillation

Apr-24-2026, 07:38:15 GMT–Neural Information Processing Systems

We introduce an offline multi-agent reinforcement learning (offline MARL) framework that learns from previously collected data without additional online data collection. Our method reformulates offline MARL as a sequence modeling problem and thus builds on the simplicity and scalability of the Transformer architecture. Following the centralized training and decentralized execution paradigm, we first train a teacher policy that has privileged access to every agent's observations, actions, and rewards. After the teacher policy has identified and recombined the "good" behavior in the dataset, we create separate student policies and distill into them not only the teacher policy's features but also the structural relations among different agents' features. We show that our framework significantly improves performance on a range of tasks and outperforms state-of-the-art offline MARL baselines. Furthermore, we demonstrate that the proposed method converges faster, is more sample-efficient, and is more robust to varying demonstration quality than the baselines.
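
The abstract does not state the exact form of the distillation objective, so the following is only a minimal sketch of how a combined feature and relation distillation loss might look. It assumes that each agent has one feature vector, that relations are measured as pairwise cosine similarities between agents' features, and that both terms use mean squared error. The names `distillation_loss`, `relation_matrix`, and `relation_weight` are hypothetical and are not taken from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def relation_matrix(features):
    """Pairwise cosine similarities among agents' feature vectors (assumed relation measure)."""
    return [[cosine(fi, fj) for fj in features] for fi in features]

def mse(xs, ys):
    """Mean squared error over two flat sequences of equal length."""
    return sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def distillation_loss(teacher_feats, student_feats, relation_weight=1.0):
    """Feature-matching term plus a weighted relation-matching term.

    teacher_feats, student_feats: one feature vector per agent, same shapes.
    Both terms are assumed to be MSE; the paper may use a different form.
    """
    flat_t = [x for f in teacher_feats for x in f]
    flat_s = [x for f in student_feats for x in f]
    feature_term = mse(flat_t, flat_s)

    rel_t = [x for row in relation_matrix(teacher_feats) for x in row]
    rel_s = [x for row in relation_matrix(student_feats) for x in row]
    relation_term = mse(rel_t, rel_s)

    return feature_term + relation_weight * relation_term

# Two agents with 2-dimensional features (illustrative values only).
teacher = [[1.0, 0.0], [0.0, 1.0]]
student = [[1.0, 0.0], [1.0, 0.0]]
loss = distillation_loss(teacher, student)
```

In this toy setup the relation term penalizes the student agents for having identical features when the teacher's features for the two agents are orthogonal, which a per-agent feature term alone would not capture as a property of the pair.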

distillation, machine learning, reinforcement learning

Country:
- North America (0.28)

Genre:
- Research Report (0.46)

Industry:
- Education (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Agents (1.00)
  - Machine Learning
    - Reinforcement Learning (1.00)
    - Neural Networks > Deep Learning (0.66)

Duplicate Docs

Title
Offline Multi-Agent Reinforcement Learning with Knowledge Distillation, Lin Yen-Chen
