Agent Societies
Supplementary Materials of The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games
For notational convenience, we assume here that all agents share critic and actor networks. In continuous action spaces, an action is sampled from a Gaussian distribution. In the loss functions, B refers to the batch size and n refers to the number of agents. The Multi-agent Particle-World Environment (MPE) was introduced in (Lowe et al., 2017). The StarCraftII Micromanagement Challenge (SMAC) tasks were introduced in (Rashid et al., 2019).
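The Gaussian action sampling and the roles of B and n can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the function names, the diagonal-Gaussian parameterization by a mean and a log standard deviation, and the array shapes are all assumptions made for illustration.

```python
import numpy as np


def sample_gaussian_actions(mean, log_std, rng):
    """Sample continuous actions from a diagonal Gaussian.

    mean, log_std: arrays of shape (B, n, action_dim), assumed to come
    from a single actor network shared by all n agents.
    """
    std = np.exp(log_std)
    return mean + std * rng.standard_normal(mean.shape)


def gaussian_log_prob(actions, mean, log_std):
    """Log-density of each action, summed over action dimensions."""
    std = np.exp(log_std)
    per_dim = (
        -0.5 * ((actions - mean) / std) ** 2
        - log_std
        - 0.5 * np.log(2.0 * np.pi)
    )
    return per_dim.sum(axis=-1)  # shape (B, n)


def mean_over_batch_and_agents(per_sample):
    """Average a (B, n) array of per-agent terms, i.e. divide by B * n."""
    B, n = per_sample.shape
    return per_sample.sum() / (B * n)


rng = np.random.default_rng(0)
B, n, action_dim = 4, 3, 2  # hypothetical sizes
mean = np.zeros((B, n, action_dim))
log_std = np.zeros((B, n, action_dim))
actions = sample_gaussian_actions(mean, log_std, rng)
avg_log_prob = mean_over_batch_and_agents(gaussian_log_prob(actions, mean, log_std))
```

Because the actor is assumed to be shared, the per-agent terms are simply stacked along the agent axis and normalized by B times n.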
EvolveGraph: Multi-Agent Trajectory Prediction with Dynamic Relational Reasoning
Multi-agent interacting systems are prevalent in the world, from purely physical systems to complicated social dynamic systems. In many applications, effective understanding of the situation and accurate trajectory prediction of interactive agents play a significant role in downstream tasks, such as decision making and planning.
Strategic Behavior is Bliss: Iterative Voting Improves Social Welfare
Recent work in iterative voting has defined the additive dynamic price of anarchy (ADPoA) as the difference in social welfare between the truthful and worst-case equilibrium profiles resulting from repeated strategic manipulations. While iterative plurality has been shown to return only alternatives with at most one fewer initial vote than the truthful winner, it is less understood how agents' welfare changes in equilibrium. To this end, we differentiate agents' utility from their manipulation mechanism and determine iterative plurality's ADPoA in the worst and average cases. We first prove that the worst-case ADPoA is linear in the number of agents. To overcome this negative result, we study the average-case ADPoA and prove that equilibrium winners have a constant-order welfare advantage over the truthful winner in expectation. Our positive results illustrate the prospect for social welfare to increase due to strategic manipulation.
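The additive welfare gap described above can be made concrete with a small Python sketch. It rests on several assumptions not stated in the abstract: utilities are given as an agents-by-alternatives table, plurality ties are broken in favor of the lowest-indexed alternative, the gap is taken as truthful welfare minus worst-case equilibrium welfare, and the set of equilibrium winners is supplied by the caller rather than computed by simulating the iterative dynamics. All function names are hypothetical.

```python
from collections import Counter


def plurality_winner(top_choices):
    """Plurality winner; ties go to the lowest index (assumed rule)."""
    counts = Counter(top_choices)
    best = max(counts.values())
    return min(a for a, c in counts.items() if c == best)


def social_welfare(utilities, alternative):
    """Sum of all agents' utilities for one alternative."""
    return sum(u[alternative] for u in utilities)


def additive_gap(utilities, truthful_winner, equilibrium_winners):
    """Truthful welfare minus the worst welfare among equilibrium winners.

    A negative value means every listed equilibrium winner yields
    higher welfare than the truthful winner.
    """
    worst = min(social_welfare(utilities, w) for w in equilibrium_winners)
    return social_welfare(utilities, truthful_winner) - worst


# Hypothetical profile: 5 agents, 3 alternatives, utilities[i][a].
utilities = [
    [2, 1, 0],
    [2, 1, 0],
    [0, 2, 1],
    [0, 1, 2],
    [0, 1, 2],
]
# Each agent's truthful vote is its highest-utility alternative.
top_choices = [max(range(3), key=lambda a: u[a]) for u in utilities]
truthful = plurality_winner(top_choices)
# Equilibrium winners are an input here, not derived by the sketch.
gap = additive_gap(utilities, truthful, equilibrium_winners=[2])
```

In this toy profile the gap is negative, which matches the abstract's point that strategic manipulation can raise social welfare.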