Calibration of Shared Equilibria in General Sum Partially Observable Markov Games
Neural Information Processing Systems
We consider a general sum partially observable Markov game where agents of different types share a single policy network, conditioned on agent-specific information. This paper aims at i) formally understanding equilibria reached by such agents, and ii) matching emergent phenomena of such equilibria to real-world targets. Parameter sharing with decentralized execution has been introduced as an efficient way to train multiple agents using a single policy network.
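To make the idea of a single policy conditioned on agent-specific information concrete, here is a minimal sketch in plain Python. It is not the paper's architecture: the linear-softmax policy, the one-hot type encoding, and the names `SharedPolicy`, `one_hot`, and `action_probs` are all assumptions chosen for illustration. The sketch only shows that one set of weights can produce different action distributions for agents of different types given the same observation.

```python
import math
import random


def one_hot(index, size):
    """Encode an agent type as a one-hot vector (illustrative choice)."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class SharedPolicy:
    """Hypothetical linear-softmax policy whose weights are shared by all agents."""

    def __init__(self, obs_dim, num_types, num_actions, seed=0):
        rng = random.Random(seed)
        self.num_types = num_types
        in_dim = obs_dim + num_types  # observation plus agent-type encoding
        self.weights = [
            [rng.gauss(0.0, 1.0) for _ in range(in_dim)]
            for _ in range(num_actions)
        ]

    def action_probs(self, obs, agent_type):
        # Each agent evaluates the same weights on its own local input,
        # which is what allows execution to remain decentralized.
        x = list(obs) + one_hot(agent_type, self.num_types)
        logits = [sum(w * xi for w, xi in zip(row, x)) for row in self.weights]
        return softmax(logits)


policy = SharedPolicy(obs_dim=3, num_types=2, num_actions=4)
obs = [0.5, -1.0, 2.0]
probs_type0 = policy.action_probs(obs, agent_type=0)
probs_type1 = policy.action_probs(obs, agent_type=1)
```

Because the type encoding is part of the input, the shared weights can still express type-specific behavior, while training updates only one set of parameters.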
Feb-9-2026, 15:26:51 GMT