 Agents




On Blame Attribution for Accountable Multi-Agent Sequential Decision Making

Neural Information Processing Systems

Blame attribution is one of the key aspects of accountable decision making, as it provides means to quantify the responsibility of an agent for a decision making outcome. In this paper, we study blame attribution in the context of cooperative multi-agent sequential decision making.


Calibration of Shared Equilibria in General Sum Partially Observable Markov Games

Neural Information Processing Systems

We consider a general sum partially observable Markov game where agents of different types share a single policy network, conditioned on agent-specific information. This paper aims at i) formally understanding equilibria reached by such agents, and ii) matching emergent phenomena of such equilibria to real-world targets. Parameter sharing with decentralized execution has been introduced as an efficient way to train multiple agents using a single policy network.
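
As a rough illustration of the parameter-sharing idea in this abstract, the sketch below lets agents of different types query one set of policy weights, with a one-hot type vector appended to the observation. It is not the paper's method: the linear-softmax policy, the `SharedPolicy` class, and all dimensions are assumptions chosen only to keep the example small.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class SharedPolicy:
    """Hypothetical linear-softmax policy whose weights are shared by all agents."""

    def __init__(self, obs_dim, num_types, num_actions, seed=0):
        rng = random.Random(seed)
        self.num_types = num_types
        in_dim = obs_dim + num_types  # observation plus one-hot agent type
        self.weights = [
            [rng.uniform(-1.0, 1.0) for _ in range(in_dim)]
            for _ in range(num_actions)
        ]

    def action_probs(self, obs, agent_type):
        """Action distribution for one agent, conditioned on its type."""
        one_hot = [1.0 if t == agent_type else 0.0 for t in range(self.num_types)]
        x = list(obs) + one_hot
        logits = [sum(w * v for w, v in zip(row, x)) for row in self.weights]
        return softmax(logits)


# Two agents of different types see the same observation but use the same weights.
policy = SharedPolicy(obs_dim=3, num_types=2, num_actions=4)
obs = [0.5, -0.2, 0.1]
probs_type0 = policy.action_probs(obs, agent_type=0)
probs_type1 = policy.action_probs(obs, agent_type=1)
```

Because only the type input differs between the two calls, any difference between `probs_type0` and `probs_type1` comes from the agent-specific conditioning rather than from separate networks.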


6af779991368999ab3da0d366c208fba-Paper-Conference.pdf

Neural Information Processing Systems

Planning enables autonomous agents to solve complex decision-making problems by evaluating predictions of the future. However, classical planning algorithms often become infeasible in real-world settings where state spaces are high-dimensional and transition dynamics unknown.



Finding Regions of Heterogeneity in Decision-Making via Expected Conditional Covariance

Neural Information Processing Systems

Individuals often make different decisions when faced with the same context, due to personal preferences and background. For instance, judges may vary in their leniency towards certain drug-related offenses, and doctors may vary in their preference for how to start treatment for certain types of patients.


Multi-Agent Reinforcement Learning is A Sequence Modeling Problem

Neural Information Processing Systems

Recently, such difficulty in multi-agent learning has been eased owing to the introduction of centralized training for decentralized execution (CTDE) [11, 45], which allows agents to access the global information and opponents' actions during the training phase.