 Agents


Calibration of Shared Equilibria in General Sum Partially Observable Markov Games

Neural Information Processing Systems

Parameter sharing with decentralized execution has been introduced as an efficient way to train multiple agents using a single policy network. This paper aims at i) formally understanding the equilibria reached by such agents, and ii) matching emergent phenomena of such equilibria to real-world targets.
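As a rough illustration of parameter sharing with decentralized execution, the sketch below has every agent query one shared policy using only its own local observation. The `SharedPolicy` class, the linear-softmax form, the agent names, and the observation values are all hypothetical choices made for this example; they are not taken from the paper.

```python
import math
import random


class SharedPolicy:
    """A single set of weights used by every agent (hypothetical linear-softmax policy)."""

    def __init__(self, obs_dim, n_actions, seed=0):
        rng = random.Random(seed)
        # One weight row per action; all agents read the same rows.
        self.weights = [
            [rng.gauss(0.0, 0.1) for _ in range(obs_dim)] for _ in range(n_actions)
        ]

    def action_probs(self, obs):
        # Linear logits followed by a numerically stable softmax.
        logits = [sum(w * x for w, x in zip(row, obs)) for row in self.weights]
        top = max(logits)
        exps = [math.exp(value - top) for value in logits]
        total = sum(exps)
        return [e / total for e in exps]


def act(policy, local_obs, rng):
    # Decentralized execution: the agent sees only its own observation.
    probs = policy.action_probs(local_obs)
    return rng.choices(range(len(probs)), weights=probs)[0]


policy = SharedPolicy(obs_dim=2, n_actions=2)
rng = random.Random(1)
observations = {"agent_0": [0.5, -1.0], "agent_1": [0.2, 0.3]}
actions = {name: act(policy, obs, rng) for name, obs in observations.items()}
```

Because the weights are shared, experience gathered by any agent would update the same policy. How that update is performed is left out of this sketch.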


3dd48ab31d016ffcbf3314df2b3cb9ce-Reviews.html

Neural Information Processing Systems

The paper considers global and local path planning for multiple agents in 2-D with a centralized message-passing algorithm derived from the three-weight version of ADMM, an established algorithm. The contributions are clearly stated in the introduction: The authors decompose global planning optimization into several sub-problems they dub minimizers, which describe various planning objectives that comprise the larger overall problem to be solved. Minimizers are derived for avoiding inter-agent collisions, avoiding collisions with static obstacles, and for maximizing/minimizing kinetic energy or velocity. They also apply their approach to local planning by reformulating joint optimization.



7f2be1b45d278ac18804b79207a24c53-AuthorFeedback.pdf

Neural Information Processing Systems

We thank the reviewers for their insightful feedback. We address reviewer comments below and begin by situating the paper's intended contribution: Why is this our goal? POMDP planners incur the complexity of full, closed-loop planning only when necessary. VoI is "contrary to the core concept of POMDPs", VoI macro-actions expand the set of problems that can be efficiently What is not our goal? The primary critique of reviewers is the limited scope of our experimental results.


The Smoothed Possibility of Social Choice

Neural Information Processing Systems

We develop a framework that leverages the smoothed complexity analysis by Spielman and Teng [60] to circumvent paradoxes and impossibility theorems in social choice, motivated by modern applications of social choice powered by AI and ML. For Condorcet's paradox, we prove that the smoothed likelihood of the paradox either vanishes at an exponential rate as the number of agents increases, or does not vanish at all. For the ANR impossibility on the non-existence of voting rules that simultaneously satisfy anonymity, neutrality, and resolvability, we characterize the rate at which the impossibility vanishes to be either polynomially fast or exponentially fast. We also propose a novel easy-to-compute tie-breaking mechanism that optimally preserves anonymity and neutrality for an even number of alternatives in natural settings. Our results illustrate the smoothed possibility of social choice: even though the paradox and the impossibility theorem hold in the worst case, they may not be a big concern in practice.
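For readers unfamiliar with Condorcet's paradox, the following minimal sketch checks a preference profile for a cycle in the strict pairwise-majority relation. It illustrates only the paradox itself on the classic three-voter profile; it does not implement the smoothed analysis described above, and the function names and example profiles are assumptions made for this illustration.

```python
from itertools import permutations


def beats(profile, a, b):
    """Return True if a strict majority of voters rank a above b."""
    a_over_b = sum(1 for ranking in profile if ranking.index(a) < ranking.index(b))
    return 2 * a_over_b > len(profile)


def has_majority_cycle(profile):
    """Return True if some triple a, b, c has a beats b, b beats c, c beats a."""
    alternatives = profile[0]
    return any(
        beats(profile, a, b) and beats(profile, b, c) and beats(profile, c, a)
        for a, b, c in permutations(alternatives, 3)
    )


# Classic paradox: each alternative loses to another by a 2-to-1 majority.
cyclic_profile = [("a", "b", "c"), ("b", "c", "a"), ("c", "a", "b")]
# Unanimous voters: the majority relation is transitive, so no cycle.
unanimous_profile = [("a", "b", "c")] * 3

cycle_in_cyclic = has_majority_cycle(cyclic_profile)
cycle_in_unanimous = has_majority_cycle(unanimous_profile)
```

Here `cycle_in_cyclic` is `True` and `cycle_in_unanimous` is `False`. The check looks only for three-alternative cycles, which is enough for this example.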




Shared Experience Actor-Critic for Multi-Agent Reinforcement Learning

Neural Information Processing Systems

Exploration in multi-agent reinforcement learning is a challenging problem, especially in environments with sparse rewards. We propose a general method for efficient exploration by sharing experience amongst agents.