Agents
Computing Nash Equilibria in Generalized Interdependent Security Games
We study the computational complexity of computing Nash equilibria in generalized interdependent-security (IDS) games. Like traditional IDS games, originally introduced by economists and risk-assessment experts Heal and Kunreuther about a decade ago, generalized IDS games model agents' voluntary investment decisions when facing potential direct risk and transfer risk exposure from other agents. A distinct feature of generalized IDS games, however, is that full investment can reduce transfer risk. As a result, depending on the transfer-risk reduction level, generalized IDS games may exhibit strategic complementarity (SC) or strategic substitutability (SS). We consider three variants of generalized IDS games in which players exhibit only SC, only SS, and both SC+SS. We show that determining whether there is a pure-strategy Nash equilibrium (PSNE) in SC+SS-type games is NP-complete, while computing a single PSNE in SC-type games takes worst-case polynomial time. As for the problem of computing all mixed-strategy Nash equilibria (MSNE) efficiently, we produce a partial characterization. Whenever each agent in the game is indiscriminate in terms of the transfer-risk exposure to the other agents, a case that Kearns and Ortiz originally studied in the context of traditional IDS games in their NIPS 2003 paper, we can compute all MSNE that satisfy some ordering constraints in polynomial time in all three game variants. Yet, there is a computational barrier in the general (transfer) case: we show that the computational problem is as hard as the Pure-Nash-Extension problem, also originally introduced by Kearns and Ortiz, and that it is NP-complete for all three variants. Finally, we experimentally examine and discuss the practical impact that the additional protection from transfer risk allowed in generalized IDS games has on MSNE by solving several randomly-generated instances of SC+SS-type games with graph structures taken from several real-world datasets.
Fairness in Multi-Agent Sequential Decision-Making
Zhang, Chongjie, Shah, Julie A.
We define a fairness solution criterion for multi-agent decision-making problems in which agents have local interests. This new criterion aims to maximize the worst performance among agents while also taking overall performance into account. We develop a simple linear programming approach and a more scalable game-theoretic approach for computing an optimal fairness policy. The game-theoretic approach formulates the fairness optimization as a two-player, zero-sum game and employs an iterative algorithm for finding a Nash equilibrium, which corresponds to an optimal fairness policy. We scale up this approach by exploiting problem structure and value function approximation. Our experiments on resource allocation problems show that this fairness criterion provides a more favorable solution than the utilitarian criterion, and that our game-theoretic approach is significantly faster than linear programming.
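To make the contrast between a fairness criterion and the utilitarian criterion concrete, here is a minimal sketch in Python. It assumes a finite set of candidate policies whose per-agent values are already known, and it approximates "worst performance with consideration of overall performance" as a lexicographic rule (worst-off agent first, total value as tie-breaker). The rule, the function names, and the numbers are illustrative assumptions, not the criterion or the algorithms defined in the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical candidate policies: each maps to the value obtained by every agent.
CANDIDATES: Dict[str, List[float]] = {
    "greedy":   [10.0, 1.0, 1.0],  # highest total, but two agents do badly
    "balanced": [4.0, 3.0, 3.0],   # same worst case as "even", higher total
    "even":     [3.0, 3.0, 3.0],
}

def fairness_key(values: List[float]) -> Tuple[float, float]:
    """Assumed lexicographic rule: worst-off agent first, total value second."""
    return (min(values), sum(values))

def pick_fair(candidates: Dict[str, List[float]]) -> str:
    """Return the policy that maximizes the assumed fairness key."""
    return max(candidates, key=lambda name: fairness_key(candidates[name]))

def pick_utilitarian(candidates: Dict[str, List[float]]) -> str:
    """Return the policy that maximizes the sum of agent values."""
    return max(candidates, key=lambda name: sum(candidates[name]))

fair_choice = pick_fair(CANDIDATES)
utilitarian_choice = pick_utilitarian(CANDIDATES)
print(fair_choice, utilitarian_choice)
```

On this toy data the two criteria disagree: the utilitarian rule selects the policy with the largest total even though two agents receive very little, while the maximin-style rule selects the policy with the best worst case and uses the total only to break ties.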
Diverse Randomized Agents Vote to Win
Jiang, Albert, Marcolino, Leandro Soriano, Procaccia, Ariel D., Sandholm, Tuomas, Shah, Nisarg, Tambe, Milind
We investigate the power of voting among diverse, randomized software agents. With teams of computer Go agents in mind, we develop a novel theoretical model of two-stage noisy voting that builds on recent work in machine learning. This model allows us to reason about a collection of agents with different biases (determined by the first-stage noise models), which, furthermore, apply randomized algorithms to evaluate alternatives and produce votes (captured by the second-stage noise models). We analytically demonstrate that a uniform team, consisting of multiple instances of any single agent, must make a significant number of mistakes, whereas a diverse team converges to perfection as the number of agents grows. Our experiments, which pit teams of computer Go agents against strong agents, provide evidence for the effectiveness of voting when agents are diverse.
An Exact Double-Oracle Algorithm for Zero-Sum Extensive-Form Games with Imperfect Information
Bosansky, B., Kiekintveld, C., Lisy, V., Pechoucek, M.
Developing scalable solution algorithms is one of the central problems in computational game theory. We present an iterative algorithm for computing an exact Nash equilibrium for two-player zero-sum extensive-form games with imperfect information. Our approach combines two key elements: (1) the compact sequence-form representation of extensive-form games and (2) the algorithmic framework of double-oracle methods. The main idea of our algorithm is to restrict the game by allowing the players to play only selected sequences of available actions. After solving the restricted game, new sequences are added by finding best responses to the current solution using fast algorithms. We experimentally evaluate our algorithm on a set of games inspired by patrolling scenarios, board, and card games. The results show significant runtime improvements in games admitting an equilibrium with small support, and substantial improvement in memory use even on games with large support. The improvement in memory use is particularly important because it allows our algorithm to solve much larger game instances than existing linear programming methods. Our main contributions include (1) a generic sequence-form double-oracle algorithm for solving zero-sum extensive-form games; (2) fast methods for maintaining a valid restricted game model when adding new sequences; (3) a search algorithm and pruning methods for computing best-response sequences; (4) theoretical guarantees about the convergence of the algorithm to a Nash equilibrium; (5) experimental analysis of our algorithm on several games, including an approximate version of the algorithm.
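The double-oracle idea described above can be sketched on a much simpler object than the paper's setting. The Python sketch below works on a zero-sum matrix (normal-form) game rather than on the sequence form of an extensive-form game, and it solves each restricted game only approximately with fictitious play instead of an exact linear program. Both simplifications are assumptions made to keep the example short and dependency-free; the names `solve_restricted`, `double_oracle`, and `GAME` are hypothetical.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]  # payoff[i][j] is the row player's gain; the column player loses it

def solve_restricted(payoff: Matrix, rows: List[int], cols: List[int],
                     iters: int = 5000) -> Tuple[Dict[int, float], Dict[int, float]]:
    """Approximate an equilibrium of the restricted game with fictitious play."""
    row_counts = {i: 0 for i in rows}
    col_counts = {j: 0 for j in cols}
    r, c = rows[0], cols[0]
    for _ in range(iters):
        row_counts[r] += 1
        col_counts[c] += 1
        # Each player best-responds to the opponent's empirical play so far.
        r = max(rows, key=lambda i: sum(payoff[i][j] * n for j, n in col_counts.items()))
        c = min(cols, key=lambda j: sum(payoff[i][j] * n for i, n in row_counts.items()))
    x = {i: n / iters for i, n in row_counts.items()}
    y = {j: n / iters for j, n in col_counts.items()}
    return x, y

def double_oracle(payoff: Matrix) -> Tuple[Dict[int, float], Dict[int, float], float]:
    """Grow the restricted game with best responses until neither oracle adds anything."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    rows, cols = [0], [0]  # start from an arbitrary pair of pure strategies
    while True:
        x, y = solve_restricted(payoff, rows, cols)
        # Best-response oracles search the FULL strategy sets.
        br_row = max(range(n_rows), key=lambda i: sum(payoff[i][j] * p for j, p in y.items()))
        br_col = min(range(n_cols), key=lambda j: sum(payoff[i][j] * p for i, p in x.items()))
        grew = False
        if br_row not in rows:
            rows.append(br_row)
            grew = True
        if br_col not in cols:
            cols.append(br_col)
            grew = True
        if not grew:
            value = sum(payoff[i][j] * px * py
                        for i, px in x.items() for j, py in y.items())
            return x, y, value

# Illustrative 3x3 game with a pure saddle point at row 2, column 1 (value 2).
GAME: Matrix = [[3.0, 1.0, 4.0],
                [2.0, 0.0, 1.0],
                [5.0, 2.0, 6.0]]
x, y, value = double_oracle(GAME)
print(sorted(x), sorted(y), round(value, 2))
```

In this toy run the third column is never added to the restricted game, which illustrates in miniature why the approach can save memory when an equilibrium has small support. Because the restricted games are solved only approximately here, the returned strategies are approximate as well; an exact variant would replace `solve_restricted` with a linear program.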
A DDoS-Aware IDS Model Based on Danger Theory and Mobile Agents
Zamani, Mahdi, Movahedi, Mahnush, Ebadzadeh, Mohammad, Pedram, Hossein
We propose an artificial immune model for intrusion detection in distributed systems based on a relatively recent theory in immunology called danger theory. According to danger theory, the immune response in natural systems results from sensing corruption as well as sensing unknown substances. In contrast, the traditional self-nonself discrimination theory states that an immune response is initiated only by sensing nonself (unknown) patterns. Danger theory solves many problems that could only be partially explained by the traditional model. Although the traditional model is simpler, such problems result in high false positive rates in immune-inspired intrusion detection systems. We believe that using danger theory in a multi-agent environment that computationally emulates the behavior of natural immune systems is effective in reducing false positive rates. We first describe a simplified scenario of immune response in natural systems based on danger theory and then convert it to a computational model as a network protocol. In our protocol, we define several immune signals and model cell signaling via message passing between agents that emulate cells. Most messages include application-specific patterns that must be meaningfully extracted from various system properties. We show how to model these messages in practice by performing a case study on the problem of detecting distributed denial-of-service attacks in wireless sensor networks. We conduct a set of systematic experiments to find a set of performance metrics that can accurately distinguish malicious patterns. The results indicate that the system can be used efficiently to detect malicious patterns with a high level of accuracy.
The Computational Theory of Intelligence: Information Entropy
This paper introduces a computational approach to the study of intelligence that the author has developed over years of study. The approach takes into account data from psychology, neurology, artificial intelligence, machine learning, and mathematics. Central to this framework is the idea that the goal of any intelligent agent is to reduce the randomness in its environment in some meaningful way. Formal definitions of terms such as "intelligence", "environment", and "agent" follow in the paper. The approach draws from multidisciplinary research and has many applications. We use the construct in discussions at the end of the paper; other applications will follow in future work. Implementations of this framework can apply to many fields of study, including general artificial intelligence (GAI), machine learning, optimization, information gathering, clustering, and big data, and extend beyond applied mathematics and computer science to sociology, psychology, neurology, and even philosophy.
Reinforcement Learning and Nonparametric Detection of Game-Theoretic Equilibrium Play in Social Networks
Gharehshiran, Omid Namvar, Hoiles, William, Krishnamurthy, Vikram
The first part of the paper presents a reinforcement learning (adaptive filtering) algorithm that facilitates learning an equilibrium by resorting to diffusion cooperation strategies in a social network. Agents form homophilic social groups, within which they exchange past experiences over an undirected graph. It is shown that, if all agents follow the proposed algorithm, their global behavior is attracted to the correlated equilibria set of the game. The second part of the paper provides a test to detect if the actions of agents are consistent with play from the equilibrium of a concave potential game. The theory of revealed preference from microeconomics is used to construct a nonparametric decision test and statistical test which only require the probe and associated actions of agents. A stochastic gradient algorithm is given to optimize the probe in real time to minimize the Type-II error probabilities of the detection test subject to specified Type-I error probability. We provide a real-world example using the energy market, and a numerical example to detect malicious agents in an online social network. Index Terms--Multi-agent signal processing, non-cooperative games, social networks, correlated equilibrium, diffusion cooperation, homophily behavior, revealed preferences, Afriat's theorem, stochastic approximation algorithm.
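As background for the revealed-preference test mentioned above, the following Python sketch checks the Generalized Axiom of Revealed Preference (GARP), which by Afriat's theorem is equivalent to the data being consistent with maximization of some well-behaved utility function under a linear budget. This is the classical single-agent check, shown under the assumption that probes play the role of price vectors and responses the role of chosen bundles; the paper's test for equilibrium play in a concave potential game may differ in its details. The function names are hypothetical.

```python
from typing import List, Sequence

def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(u * v for u, v in zip(a, b))

def satisfies_garp(probes: List[Sequence[float]],
                   responses: List[Sequence[float]]) -> bool:
    """Return True if the (probe, response) data satisfy GARP."""
    n = len(probes)
    # prefers[t][s]: response t was chosen although response s cost no more at probe t.
    prefers = [[dot(probes[t], responses[t]) >= dot(probes[t], responses[s])
                for s in range(n)] for t in range(n)]
    # Warshall's algorithm: transitive closure of the direct relation.
    for k in range(n):
        for t in range(n):
            for s in range(n):
                if prefers[t][k] and prefers[k][s]:
                    prefers[t][s] = True
    # Violation: t is revealed preferred to s, yet at probe s the response t
    # was strictly cheaper than the response s that was actually chosen.
    for t in range(n):
        for s in range(n):
            if prefers[t][s] and dot(probes[s], responses[s]) > dot(probes[s], responses[t]):
                return False
    return True

# Consistent data: neither response was affordable when the other was chosen.
consistent = satisfies_garp([(1, 2), (2, 1)], [(4, 0), (0, 4)])
# Inconsistent data: each response was strictly cheaper at the other probe.
inconsistent = satisfies_garp([(1, 2), (2, 1)], [(0, 2), (2, 0)])
print(consistent, inconsistent)
```

The check needs only the probes and the observed responses, which matches the nonparametric character of the test described in the abstract; no utility function has to be specified in advance.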
Projective simulation for classical learning agents: a comprehensive investigation
Mautner, Julian, Makmal, Adi, Manzano, Daniel, Tiersch, Markus, Briegel, Hans J.
We study the model of projective simulation (PS), a novel approach to artificial intelligence based on stochastic processing of episodic memory which was recently introduced [H.J. Briegel and G. De las Cuevas. Sci. Rep. 2, 400, (2012)]. Here we provide a detailed analysis of the model and examine its performance, including its achievable efficiency, its learning times and the way both properties scale with the problems' dimension. In addition, we situate the PS agent in different learning scenarios, and study its learning abilities. A variety of new scenarios are being considered, thereby demonstrating the model's flexibility. Furthermore, to put the PS scheme in context, we compare its performance with those of Q-learning and learning classifier systems, two popular models in the field of reinforcement learning. It is shown that PS is a competitive artificial intelligence model of unique properties and strengths.
No Agent Left Behind: Dynamic Fair Division of Multiple Resources
Kash, I., Procaccia, A. D., Shah, N.
Recently, fair division theory has emerged as a promising approach for the allocation of multiple computational resources among agents. While in reality agents are not all present in the system simultaneously, previous work has studied static settings where all relevant information is known upfront. Our goal is to better understand the dynamic setting. On the conceptual level, we develop a dynamic model of fair division, and propose desirable axiomatic properties for dynamic resource allocation mechanisms. On the technical level, we construct two novel mechanisms that provably satisfy some of these properties, and analyze their performance using real data. We believe that our work informs the design of superior multiagent systems, and at the same time expands the scope of fair division theory by initiating the study of dynamic and fair resource allocation mechanisms.
Distributed Policy Evaluation Under Multiple Behavior Strategies
Macua, Sergio Valcarcel, Chen, Jianshu, Zazo, Santiago, Sayed, Ali H.
We apply diffusion strategies to develop a fully-distributed cooperative reinforcement learning algorithm in which agents in a network communicate only with their immediate neighbors to improve predictions about their environment. The algorithm can also be applied to off-policy learning, meaning that the agents can predict the response to a behavior different from the actual policies they are following. The proposed distributed strategy is efficient, with linear complexity in both computation time and memory footprint. We provide a mean-square-error performance analysis and establish convergence under constant step-size updates, which endow the network with continuous learning capabilities. The results show a clear gain from cooperation: when the individual agents can estimate the solution, cooperation increases stability and reduces bias and variance of the prediction error; but, more importantly, the network is able to approach the optimal solution even when none of the individual agents can (e.g., when the individual behavior policies restrict each agent to sample a small portion of the state space).