Tambe, Milind

Mitigating the Curse of Correlation in Security Games by Entropy Maximization

arXiv.org Artificial Intelligence

In Stackelberg security games, a defender seeks to randomly allocate limited security resources to protect critical targets from an attack. In this paper, we study a fundamental, yet underexplored, phenomenon in security games, which we term the \emph{Curse of Correlation} (CoC). Specifically, we observe that there are inevitable correlations among the protection status of different targets. Such correlation is a crucial concern, especially in \emph{spatio-temporal} domains like conservation area patrolling, where attackers can surveil patrollers at certain areas and then infer their patrolling routes using such correlations. To mitigate this issue, we propose to design entropy-maximizing defending strategies for spatio-temporal security games, which frequently suffer from CoC. We prove that the problem is \#P-hard in general. However, it admits efficient algorithms in well-motivated special settings. Our experiments show significant advantages of max-entropy algorithms over previous algorithms. A scalable implementation of our algorithm is currently under pre-deployment testing for integration into FAMS software to improve the scheduling of US federal air marshals.

PSINET: Assisting HIV Prevention Amongst Homeless Youth by Planning Ahead

AI Magazine

Homeless youth are prone to Human Immunodeficiency Virus (HIV) due to their engagement in high risk behavior such as unprotected sex, sex under influence of drugs, and so on. Many non-profit agencies conduct interventions to educate and train a select group of homeless youth about HIV prevention and treatment practices and rely on word-of-mouth spread of information through their social network. Previous work in strategic selection of intervention participants does not handle uncertainties in the social network’s structure and evolving network state, potentially causing significant shortcomings in spread of information. Thus, we developed PSINET, a decision support system to aid the agencies in this task. PSINET includes the following key novelties: (1) it handles uncertainties in network structure and evolving network state; (2) it addresses these uncertainties by using POMDPs in influence maximization; and (3) it provides algorithmic advances to allow high quality approximate solutions for such POMDPs. Simulations show that PSINET achieves around 60 percent more information spread over the current state of the art. PSINET was developed in collaboration with My Friend’s Place (a drop-in agency serving homeless youth in Los Angeles) and is currently being reviewed by their officials.

One Size Does Not Fit All: A Game-Theoretic Approach for Dynamically and Effectively Screening for Threats

AAAI Conferences

An effective way of preventing attacks in secure areas is to screen for threats (people, objects) before entry, e.g., screening of airport passengers. However, screening every entity at the same level may be both ineffective and undesirable. The challenge then is to find a dynamic approach for randomized screening, allowing for more effective use of limited screening resources, leading to improved security. We address this challenge with the following contributions: (1) a threat screening game (TSG) model for general screening domains; (2) an NP-hardness proof for computing the optimal strategy of TSGs; (3) a scheme for decomposing TSGs into subgames to improve scalability; (4) a novel algorithm that exploits a compact game representation to efficiently solve TSGs, providing the optimal solution under certain conditions; and (5) an empirical comparison of our proposed algorithm against the current state-of-the-art optimal approach for large-scale game-theoretic resource allocation problems.

Learning Adversary Behavior in Security Games: A PAC Model Perspective

arXiv.org Artificial Intelligence

Recent applications of Stackelberg Security Games (SSG), from wildlife crime to urban crime, have employed machine learning tools to learn and predict adversary behavior using available data about defender-adversary interactions. Given these recent developments, this paper commits to an approach of directly learning the response function of the adversary. Using the PAC model, this paper lays a firm theoretical foundation for learning in SSGs (e.g., theoretically answer questions about the numbers of samples required to learn adversary behavior) and provides utility guarantees when the learned adversary model is used to plan the defender's strategy. The paper also aims to answer practical questions such as how much more data is needed to improve an adversary model's accuracy. Additionally, we explain a recently observed phenomenon that prediction accuracy of learned adversary behavior is not enough to discover the utility maximizing defender strategy. We provide four main contributions: (1) a PAC model of learning adversary response functions in SSGs; (2) PAC-model analysis of the learning of key, existing bounded rationality models in SSGs; (3) an entirely new approach to adversary modeling based on a non-parametric class of response functions with PAC-model analysis and (4) identification of conditions under which computing the best defender strategy against the learned adversary behavior is indeed the optimal strategy. Finally, we conduct experiments with real-world data from a national park in Uganda, showing the benefit of our new adversary modeling approach and verification of our PAC model predictions.

When Security Games Go Green: Designing Defender Strategies to Prevent Poaching and Illegal Fishing

AAAI Conferences

Building on the successful applications of Stackelberg Security Games (SSGs) to protect infrastructure, researchers have begun focusing on applying game theory to green security domains such as protection of endangered animals and fish stocks. Previous efforts in these domains optimize defender strategies based on the standard Stackelberg assumption that the adversaries become fully aware of the defender's strategy before taking action. Unfortunately, this assumption is inappropriate since adversaries in green security domains often lack the resources to fully track the defender strategy. This paper (i) introduces Green Security Games (GSGs), a novel game model for green security domains with a generalized Stackelberg assumption; (ii) provides algorithms to plan effective sequential defender strategies --- such planning was absent in previous work; (iii) proposes a novel approach to learn adversary models that further improves defender performance; and (iv) provides detailed experimental analysis of proposed approaches.

Security Games with Information Leakage: Modeling and Computation

arXiv.org Artificial Intelligence

Most models of Stackelberg security games assume that the attacker only knows the defender's mixed strategy, but is not able to observe (even partially) the instantiated pure strategy. Such partial observation of the deployed pure strategy -- an issue we refer to as information leakage -- is a significant concern in practical applications. While previous research on patrolling games has considered the attacker's real-time surveillance, our settings, therefore models and techniques, are fundamentally different. More specifically, after describing the information leakage model, we start with an LP formulation to compute the defender's optimal strategy in the presence of leakage. Perhaps surprisingly, we show that a key subproblem to solve this LP (more precisely, the defender oracle) is NP-hard even for the simplest of security game models. We then approach the problem from three possible directions: efficient algorithms for restricted cases, approximation algorithms, and heuristic algorithms for sampling that improves upon the status quo. Our experiments confirm the necessity of handling information leakage and the advantage of our algorithms.

Exploring Information Asymmetry in Two-Stage Security Games

AAAI Conferences

Stackelberg security games have been widely deployed to protect real-word assets. The main solution concept there is the Strong Stackelberg Equilibrium (SSE), which optimizes the defender's random allocation of limited security resources. However, solely deploying the SSE mixed strategy has limitations. In the extreme case, there are security games where the defender is able to defend all the assets ``almost perfectly" at the SSE, but she still sustains significant loss. In this paper, we propose an approach for improving the defender's utility in such scenarios. Perhaps surprisingly, our approach is to strategically reveal to the attacker information about the sampled pure strategy. Specifically, we propose a two-stage security game model, where in the first stage the defender allocates resources and the attacker selects a target to attack, and in the second stage the defender strategically reveals local information about that target, potentially deterring the attacker's attack plan. We then study how the defender can play optimally in both stages. We show, theoretically and experimentally, that the two-stage security game model allows the defender to gain strictly better utility than SSE.

Solving Zero-Sum Security Games in Discretized Spatio-Temporal Domains

AAAI Conferences

Among the many deployment areas of Stackelberg Security games, a major area involves games played out in space and time, which includes applications in multiple mobile defender resources protecting multiple mobile targets. Previous algorithms for such spatio-temporal security games fail to scale-up and little is known ofthe computational complexity properties of these problems.This paper provides a novel oracle-based algorithmic framework for a systematic study of different problem variants of computing optimal (minimax) strategies in spatio-temporal security games. Our framework enables efficient computation of a minimax strategy when the problem admits a polynomial-time oracle. Furthermore,for the cases in which efficient oracles are difficultto find, we propose approximations or prove hardness results.

Give a Hard Problem to a Diverse Team: Exploring Large Action Spaces

AAAI Conferences

Recent work has shown that diverse teams can outperform a uniform team made of copies of the best agent. However, there are fundamental questions that were not asked before. When should we use diverse or uniform teams? How does the performance change as the action space or the teams get larger? Hence, we present a new model of diversity for teams, that is more general than previous models. We prove that the performance of a diverse team improves as the size of the action space gets larger. Concerning the size of the diverse team, we show that the performance converges exponentially fast to the optimal one as we increase the number of agents. We present synthetic experiments that allow us to gain further insights: even though a diverse team outperforms a uniform team when the size of the action space increases, the uniform team will eventually again play better than the diverse team for a large enough action space. We verify our predictions in a system of Go playing agents, where we show a diverse team that improves in performance as the board size increases, and eventually overcomes a uniform team.

Computing Solutions in Infinite-Horizon Discounted Adversarial Patrolling Games

AAAI Conferences

Stackelberg games form the core of a number of tools deployed for computing optimal patrolling strategies in adversarial domains, such as the US Federal Air Marshall Service and the US Coast Guard. In traditional Stackelberg security game models the attacker knows only the probability that each target is covered by the defender, but is oblivious to the detailed timing of the coverage schedule. In many real-world situations, however, the attacker can observe the current location of the defender and can exploit this knowledge to reason about the defender’s future moves. We show that this general modeling framework can be captured using adversarial patrolling games (APGs) in which the defender sequentially moves between targets, with moves constrained by a graph, while the attacker can observe the defender’s current location and his (stochastic) policy concerning future moves. We offer a very general model of infinite-horizon discounted adversarial patrolling games. Our first contribution is to show that defender policies that condition only on the previous defense move (i.e., Markov stationary policies) can be arbitrarily suboptimal for general APGs. We then offer a mixed-integer non-linear programming (MINLP) formulation for computing optimal randomized policies for the defender that can condition on history of bounded, but arbitrary, length, as well as a mixed-integer linear programming (MILP) formulation to approximate these, with provable quality guarantees. Additionally, we present a non-linear programming (NLP) formulation for solving zero-sum APGs. We show experimentally that MILP significantly outperforms the MINLP formulation, and is, in turn, significantly outperformed by the NLP specialized to zero-sum games.