Supplementary Materials of The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games
Neural Information Processing Systems
We consider the 3 fully cooperative tasks from the original set shown in Figure 1(a): Spread, Comm, and Reference. "Use feature normalization" refers to whether feature normalization is applied to the network input. In this appendix section, we include results which demonstrate the benefit of parameter sharing. Note that our global state to the value network has agent-specific information, such as available actions and relative distances to other agents. When an agent dies, these agent-specific features become zero, while the remaining agent-agnostic features remain nonzero; this leads to a drastic distribution shift in the critic input compared to states in which the agent is alive.
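The distribution shift described above can be illustrated with a minimal sketch. It assumes a hypothetical critic-input layout in which agent-agnostic features are concatenated with agent-specific features; the function names, the layout, and all feature values are made up for illustration and are not taken from the paper's implementation.

```python
from typing import List


def build_critic_input(agent_agnostic: List[float],
                       agent_specific: List[float],
                       alive: bool) -> List[float]:
    """Concatenate agent-agnostic and agent-specific features.

    Hypothetical layout: when the agent is dead, the agent-specific block
    is zeroed while the agent-agnostic block is left unchanged.
    """
    specific = list(agent_specific) if alive else [0.0] * len(agent_specific)
    return list(agent_agnostic) + specific


def l2_distance(a: List[float], b: List[float]) -> float:
    """Euclidean distance between two equally sized feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


# Toy values (assumed): three agent-agnostic features, followed by
# available-action flags and relative distances to other agents.
agnostic = [0.8, -0.3, 1.2]
specific = [1.0, 0.0, 1.0, 0.45, 0.9]

alive_input = build_critic_input(agnostic, specific, alive=True)
dead_input = build_critic_input(agnostic, specific, alive=False)

# The agent-agnostic block is identical in both inputs; only the
# agent-specific block collapses to zeros when the agent is dead.
shift = l2_distance(alive_input, dead_input)
print("alive:", alive_input)
print("dead: ", dead_input)
print("L2 shift:", round(shift, 3))
```

In this toy setting the dead-agent input is a mixture of ordinary agent-agnostic values and an all-zero agent-specific block, a combination the critic would rarely or never see for a living agent.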
Feb-11-2026, 00:26:33 GMT