 Agents



We first thank all reviewers for their thoughtful comments, and we wish everyone health during these hard times

Neural Information Processing Systems

We first thank all reviewers for their thoughtful comments, and we wish everyone health during these hard times. We acknowledge the simplicity of our linear demand and reference price update models. These references are also discussed in Section 2 of the paper. The gradient of revenue can be calculated using estimated elasticity, observed sales (i.e. [...] Assumption 1 is invoked in all theorems and lemmas of Section 5, and we will clearly state this in the revised paper. In the proof of Lemma 3.2, we show that [...] This means that if firms are willing to consider both prices near zero and those sufficiently large, Assumption 1 holds.
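
As a rough illustration of the revenue-gradient remark in the response above, the sketch below assumes a plain linear demand d(p) = a - b*p with no reference-price term, so that revenue R(p) = p*d(p) has derivative d(p) - b*p. The function names, the parameter `sensitivity` (standing in for the estimated elasticity), and the numbers are hypothetical and are not taken from the paper.

```python
def linear_demand(price, intercept, sensitivity):
    """Hypothetical linear demand d(p) = intercept - sensitivity * p."""
    return intercept - sensitivity * price


def revenue_gradient(price, observed_sales, sensitivity):
    """Derivative of R(p) = p * d(p) under the linear demand assumed above.

    dR/dp = d(p) + p * d'(p) = observed_sales - sensitivity * price,
    so only the observed sales and the estimated sensitivity are needed.
    """
    return observed_sales - sensitivity * price


# Illustrative values only (assumptions, not results from the paper).
intercept, sensitivity, price = 10.0, 2.0, 1.5
sales = linear_demand(price, intercept, sensitivity)
grad = revenue_gradient(price, sales, sensitivity)
```

A positive gradient at the current price suggests that, under these assumptions, a small price increase would raise revenue.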




Polynomial-Time Optimal Equilibria with a Mediator in Extensive-Form Games

Neural Information Processing Systems

For common notions of correlated equilibrium in extensive-form games, computing an optimal (e.g., welfare-maximizing) equilibrium is NP-hard. Other equilibrium notions, namely communication [11] and certification [12] equilibria, augment the game with a mediator that has the power both to send messages to and receive messages from the players and, in particular, to remember those messages. In this paper, we investigate both notions in extensive-form games through a computational lens. We show that optimal equilibria in both notions can be computed in polynomial time, the latter under a natural additional assumption known in the literature. Our proof works by constructing a mediator-augmented game of polynomial size that explicitly represents the mediator's decisions and actions.



Supplementary Materials of The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games

Neural Information Processing Systems

We assume here that all agents share critic and actor networks, for notational convenience. [...] Gaussian distribution, from which an action is sampled, in continuous action spaces. In the loss functions above, B refers to the batch size and n refers to the number of agents. The Multi-agent Particle-World Environment (MPE) was introduced by Lowe et al. (2017). The StarCraft II Micromanagement Challenge (SMAC) tasks were introduced by Rashid et al. (2019).
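
The excerpt above mentions sampling actions from a Gaussian distribution and losses indexed by the batch size B and the number of agents n, but the loss functions themselves are not reproduced here. The following is a minimal sketch, assuming the standard PPO clipped surrogate averaged over all B x n (sample, agent) pairs; the function names and the `clip_eps` default are illustrative assumptions, not the authors' implementation.

```python
import math
import random


def sample_gaussian_action(mean, log_std, rng):
    """Sample a scalar continuous action from N(mean, exp(log_std)^2)."""
    return rng.gauss(mean, math.exp(log_std))


def gaussian_log_prob(action, mean, log_std):
    """Log-density of `action` under N(mean, exp(log_std)^2)."""
    z = (action - mean) / math.exp(log_std)
    return -0.5 * z * z - log_std - 0.5 * math.log(2.0 * math.pi)


def clipped_surrogate(ratios, advantages, clip_eps=0.2):
    """Standard PPO clipped objective, averaged over B samples and n agents.

    `ratios` and `advantages` are nested lists of shape [B][n]; each ratio is
    pi_new(a|o) / pi_old(a|o) for one agent in one batch sample.
    """
    B, n = len(ratios), len(ratios[0])
    total = 0.0
    for i in range(B):
        for k in range(n):
            r, adv = ratios[i][k], advantages[i][k]
            r_clipped = max(1.0 - clip_eps, min(r, 1.0 + clip_eps))
            total += min(r * adv, r_clipped * adv)
    return total / (B * n)


# Hypothetical data: B = 2 batch samples, n = 2 agents sharing one policy.
rng = random.Random(0)
action = sample_gaussian_action(mean=0.0, log_std=0.0, rng=rng)
ratios = [[1.0, 1.5], [0.5, 1.0]]
advantages = [[1.0, 1.0], [-1.0, 2.0]]
objective = clipped_surrogate(ratios, advantages)
```

Because the agents are assumed to share one actor, the same parameters produce every entry of `ratios`, which is why a single average over both indices is enough in this sketch.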