Agents
A Simple Polynomial-Time Randomized Distributed Algorithm for Connected Row Convex Constraints
Kumar, T. K. Satish (University of Southern California) | Nguyen, Duc Thien (Singapore Management University) | Yeoh, William (New Mexico State University) | Koenig, Sven (University of Southern California)
In this paper, we describe a simple randomized algorithm that runs in polynomial time and solves connected row convex (CRC) constraints in distributed settings. CRC constraints generalize many known tractable classes of constraints like 2-SAT and implicational constraints. They can model problems in many domains including temporal reasoning and geometric reasoning, and generally speaking, play the role of ``Gaussians'' in the logical world. Our simple randomized algorithm for solving them in distributed settings, therefore, has a number of important applications. We support our claims through a theoretical analysis and empirical results.
Evolutionary Dynamics of Q-Learning over the Sequence Form
Panozzo, Fabio (Politecnico di Milano) | Gatti, Nicola (Politecnico di Milano) | Restelli, Marcello (Politecnico di Milano)
Multi-agent learning is a challenging open task in artificial intelligence. It is known an interesting connection between multi-agent learning algorithms and evolutionary game theory, showing that the learning dynamics of some algorithms can be modeled as replicator dynamics with a mutation term. Inspired by the recent sequence-form replicator dynamics, we develop a new version of the Q-learning algorithm working on the sequence form of an extensive-form game allowing thus an exponential reduction of the dynamics length w.r.t. those of the normal form. The dynamics of the proposed algorithm can be modeled by using the sequence-form replicator dynamics with a mutation term. We show that, although sequence-form and normal-form replicator dynamics are realization equivalent, the Q-learning algorithm applied to the two forms have non-realization equivalent dynamics. Originally from the previous works on evolutionary game theory models form multi-agent learning, we produce an experimental evaluation to show the accuracy of the model.
Scalable Complex Contract Negotiation with Structured Search and Agenda Management
Zhang, Xiaoqin Shelley (University of Massachusetts Dartmouth) | Klein, Mark (Massachusetts Institute of Technology) | Marsa-Maestre, Ivan (Assistant Professor, University of Alcala)
A large number of interdependent issues in complex contract negotiation poses a significant challenge for current approaches, which becomes even more apparent when negotiation problems scale up. To address this challenge, we present a structured anytime search process with an agenda management mechanism using a hierarchical negotiation model, where agents search at various levels during the negotiation with the guidance of a mediator. This structured negotiation process increases computational efficiency, making negotiations scalable for large number of interdependent issues. To validate the contributions of our approach, 1) we developed our proposed negotiation model using a hierarchical problem structure and a constraint-based preference model for real-world applications; 2) we defined a scenario matrix to capture various characteristics of negotiation scenarios and developed a scenario generator that produces test cases according to this matrix; and 3) we performed an extensive set of experiments to study the performance of this structured negotiation protocol and the influence of different scenario parameters, and investigated the Pareto efficiency and social welfare optimality of the negotiation outcomes. The experimental result supports the hypothesis that this hierarchical negotiation approach greatly improves scalability with the complexity of the negotiation scenarios.
Solving Zero-Sum Security Games in Discretized Spatio-Temporal Domains
Xu, Haifeng (University of Southern California) | Fang, Fei (University of Southern California) | Jiang, Albert Xin (University of Southern California) | Conitzer, Vincent (Duke University) | Dughmi, Shaddin (University of Southern California) | Tambe, Milind (University of Southern California)
Among the many deployment areas of Stackelberg Security games, a major area involves games played out in space and time, which includes applications in multiple mobile defender resources protecting multiple mobile targets. Previous algorithms for such spatio-temporal security games fail to scale-up and little is known ofthe computational complexity properties of these problems.This paper provides a novel oracle-based algorithmic framework for a systematic study of different problem variants of computing optimal (minimax) strategies in spatio-temporal security games. Our framework enables efficient computation of a minimax strategy when the problem admits a polynomial-time oracle. Furthermore,for the cases in which efficient oracles are difficultto find, we propose approximations or prove hardness results.
Regret-Based Multi-Agent Coordination with Uncertain Task Rewards
Wu, Feng (University of Southampton) | Jennings, Nicholas R. (University of Southampton)
Many multi-agent coordination problems can be represented as DCOPs. Motivated by task allocation in disaster response, we extend standard DCOP models to consider uncertain task rewards where the outcome of completing a task depends on its current state, which is randomly drawn from unknown distributions. The goal of solving this problem is to find a solution for all agents that minimizes the overall worst-case loss. This is a challenging problem for centralized algorithms because the search space grows exponentially with the number of agents and is nontrivial for existing algorithms for standard DCOPs. To address this, we propose a novel decentralized algorithm that incorporates Max-Sum with iterative constraint generation to solve the problem by passing messages among agents. By so doing, our approach scales well and can solve instances of the task allocation problem with hundreds of agents and tasks.
Give a Hard Problem to a Diverse Team: Exploring Large Action Spaces
Marcolino, Leandro Soriano (University of Southern California) | Xu, Haifeng (University of Southern California) | Jiang, Albert Xin (University of Southern California) | Tambe, Milind (University of Southern California) | Bowring, Emma (University of the Pacific)
Recent work has shown that diverse teams can outperform a uniform team made of copies of the best agent. However, there are fundamental questions that were not asked before. When should we use diverse or uniform teams? How does the performance change as the action space or the teams get larger? Hence, we present a new model of diversity for teams, that is more general than previous models. We prove that the performance of a diverse team improves as the size of the action space gets larger. Concerning the size of the diverse team, we show that the performance converges exponentially fast to the optimal one as we increase the number of agents. We present synthetic experiments that allow us to gain further insights: even though a diverse team outperforms a uniform team when the size of the action space increases, the uniform team will eventually again play better than the diverse team for a large enough action space. We verify our predictions in a system of Go playing agents, where we show a diverse team that improves in performance as the board size increases, and eventually overcomes a uniform team.
Multiagent Metareasoning through Organizational Design
Sleight, Jason (University of Michigan) | Durfee, Edmund H. (University of Michigan)
We formulate an approach to multiagent metareasoning that uses organizational design to focus each agent's reasoning on the aspects of its local problem that let it make the most worthwhile contributions to joint behavior. By employing the decentralized Markov decision process framework, we characterize an organizational design problem that explicitly considers the quantitative impact that a design has on both the quality of the agents' behaviors and their reasoning costs. We describe an automated organizational design process that can approximately solve our organizational design problem via incremental search, and present techniques that efficiently estimate the incremental impact of a candidate organizational influence. Our empirical evaluation confirms that our process generates organizational designs that impart a desired metareasoning regime upon the agents.
Theory of Cooperation in Complex Social Networks
Ranjbar-Sahraei, Bijan (Maastricht University) | Ammar, Haitham Bou (University of Pennsylvania) | Bloembergen, Daan (Maastricht University) | Tuyls, Karl (University of Liverpool) | Weiss, Gerhard (Maastricht University)
This paper presents a theoretical as well as empirical study on the evolution of cooperation on complex social networks, following the continuous action iterated prisoner's dilemma (CAIPD) model. In particular, convergence to network-wide agreement is proven for both evolutionary networks with fixed interaction dynamics, as well as for coevolutionary networks where these dynamics change over time. Moreover, an extension to the CAIPD model is proposed that allows to model influence on the evolution of cooperation in social networks. As such, this work contributes to a better understanding of behavioral change on social networks, and provides a first step towards their active control.
Online (Budgeted) Social Choice
Oren, Joel (University of Toronto) | Lucier, Brendan (Microsoft Research, New England)
We consider a classic social choice problem in an online setting. In each round, a decision maker observes a single agent's preferences overa set of $m$ candidates, and must choose whether to irrevocably add a candidate to a selection set of limited cardinality $k$. Each agent's (positional) score depends on the candidates in the set when he arrives, and the decision-maker's goal is to maximize average (over all agents) score. We prove that no algorithm (even randomized) can achieve an approximationfactor better than $O(\frac{\log\log m}{\log m})$. In contrast, if the agents arrive in random order, we present a $(1 - \frac{1}{e} - o(1))$-approximatealgorithm, matching a lower bound for the off-line problem.We show that improved performance is possible for natural input distributionsor scoring rules. Finally, if the algorithm is permitted to revoke decisions at a fixedcost, we apply regret-minimization techniques to achieve approximation $1 - \frac{1}{e} - o(1)$ even for arbitrary inputs.
Congestion Games for V2G-Enabled EV Charging
Lutati, Benny (Ben-Gurion University of the Negev) | Levit, Vadim (Ben-Gurion University of the Negev) | Grinshpoun, Tal (Ariel University) | Meisels, Amnon (Ben-Gurion University of the Negev)
A model of the problem of charging and discharging electrical vehicles as a congestion game is presented. A generalization of congestion games - feedback congestion games (FCG) - is introduced. The charging of grid-integrated vehicles, which can also discharge energy back to the grid, is a natural FCG application. FCGs are proven to be exact potential games and therefore converge to a pure-strategy Nash equilibrium by an iterated better-response process. A compact representation and an algorithm that enable efficient best-response search are presented. A detailed empirical evaluation assesses the performance of the iterated best-response process. The evaluation considers the quality of the resulting solutions and the rate of convergence to a stable state. The effect of allowing to also discharge batteries using FCG is compared to scenarios that only include charging and is found to dramatically improve the predictability of the achieved solutions as well as the balancing of load.