An, Bo
PAWS — A Deployed Game-Theoretic Application to Combat Poaching
Fang, Fei (Harvard University) | Nguyen, Thanh H. (University of Michigan) | Pickles, Rob (Panthera) | Lam, Wai Y. (Rimba) | Clements, Gopalasamy R. (Universiti Malaysia Terengganu) | An, Bo (Nanyang Technological University) | Singh, Amandeep (University of Pennsylvania) | Schwedock, Brian C. (University of Southern California) | Tambe, Milin (University of Southern California) | Lemieux, Andrew (The Netherlands Institute for the Study of Crime and Law Enforcement (NSCR), Netherlands)
Poaching is considered a major driver for the population drop of key species such as tigers, elephants, and rhinos, which can be detrimental to whole ecosystems. While conducting foot patrols is the most commonly used approach in many countries to prevent poaching, such patrols often do not make the best use of the limited patrolling resources.
POI2Vec: Geographical Latent Representation for Predicting Future Visitors
Feng, Shanshan (Nanyang Technological University) | Cong, Gao (Nanyang Technological University) | An, Bo (Nanyang Technological University) | Chee, Yeow Meng (Nanyang Technological University)
With the increasing popularity of location-aware social media applications, Point-of-Interest (POI) recommendation has recently been extensively studied. However, most of the existing studies explore from the users' perspective, namely recommending POIs for users. In contrast, we consider a new research problem of predicting users who will visit a given POI in a given future period. The challenge of the problem lies in the difficulty to effectively learn POI sequential transition and user preference, and integrate them for prediction. In this work, we propose a new latent representation model POI2Vec that is able to incorporate the geographical influence, which has been shown to be very important in modeling user mobility behavior. Note that existing representation models fail to incorporate the geographical influence. We further propose a method to jointly model the user preference and POI sequential transition influence for predicting potential visitors for a given POI. We conduct experiments on 2 real-world datasets to demonstrate the superiority of our proposed approach over the state-of-the-art algorithms for both next POI prediction and future user prediction.
Optimal Personalized Defense Strategy Against Man-In-The-Middle Attack
Li, Xiaohong (Tianjin University) | Li, Shuxin (Tianjin University) | Hao, Jianye (Tianjin University) | Feng, Zhiyong ( Tianjin University ) | An, Bo (Nanyang Technological University)
The Man-In-The-Middle (MITM) attack is one of the most common attacks employed in the network hacking. MITM attackers can successfully invoke attacks such as denial of service (DoS) and port stealing, and lead to surprisingly harmful consequences for users in terms of both financial loss and security issues. The conventional defense approaches mainly consider how to detect and eliminate those attacks or how to prevent those attacks from being launched in the first place. This paper proposes a game-theoretic defense strategy from a different perspective, which aims at minimizing the loss that the whole system sustains given that the MITM attacks are inevitable. We model the interaction between the attacker and the defender as a Stackelberg security game and adopt the Strong Stackelberg Equilibrium (SSE) as the defender's strategy. Since the defender's strategy space is infinite in our model, we employ a novel method to reduce the searching space of computing the optimal defense strategy. Finally, we empirically evaluate our optimal defense strategy by comparing it with non-strategic defense strategies. The results indicate that our game-theoretic defense strategy significantly outperforms other non-strategic defense strategies in terms of decreasing the total losses against MITM attacks.
Security Games on a Plane
Gan, Jiarui (Nanyang Technological University) | An, Bo (Nanyang Technological University) | Vorobeychik, Yevgeniy (Vanderbilt University) | Gauch, Brian (Vanderbilt University)
Most existing models of Stackelberg security games ignore the underlying topology of the space in which targets and defence resources are located. As a result, allocation of resources is restricted to a discrete collection of exogenously defined targets. However, in many practical security settings, defense resources can be located on a continuous plane. Better defense solutions could therefore be potentially achieved by placing resources in a space outside of actual targets (e.g., between targets). To address this limitation, we propose a model called Security Game on a Plane (SGP) in which targets are distributed on a 2-dimensional plane, and security resources, to be allocated on the same plane, protect targets within a certain effective distance. We investigate the algorithmic aspects of SGP. We find that computing a strong Stackelberg equilibrium of an SGP is NP-hard even for zero-sum games, and these are inapproximable in general. On the positive side, we find an exact solution technique for general SGPs based on an existing approach, and develop a PTAS (polynomial-time approximation scheme) for zero-sum SGP to more fundamentally overcome the computational obstacle. Our experiments demonstrate the value of considering SGP and effectiveness of our algorithms.
Revenue Maximization for Finitely Repeated Ad Auctions
Rong, Jiang (University of Chinese Academy of Sciences) | Qin, Tao (Microsoft Research) | An, Bo (Nanyang Technological University) | Liu, Tie-Yan (Microsoft Research)
Reserve price is an effective tool for revenue maximization in ad auctions. The optimal reserve price depends on bidders' value distributions, which, however, are generally unknown to auctioneers. A common practice for auctioneers is to first collect information about the value distributions by a sampling procedure and then apply the reserve price estimated with the sampled bids to the following auctions. In order to maximize the total revenue over finite auctions, it is important for the auctioneer to find a proper sample size to trade off between the cost of the sampling procedure and the optimality of the estimated reserve price. We investigate the sample size optimization problem for Generalized Second Price auctions, which is the most widely-used mechanism in ad auctions, and make three main contributions along this line. First, we bound the revenue losses in the form of competitive ratio during and after sampling. Second, we formulate the problem of finding the optimal sample size as a non-convex mixed integer optimization problem. Then we characterize the properties of the problem and prove the uniqueness of the optimal sample size. Third, we relax the integer optimization problem to a continuous form and develop an efficient algorithm based on the properties to solve it. Experimental results show that our approach can significantly improve the revenue for the auctioneer in finitely repeated ad auctions.
Strategic Information Revelation and Commitment in Security Games
Guo, Qingyu (Nanyang Technological University) | An, Bo (Nanyang Technological University) | Bosansky, Branislav (Czech Technical University in Prague) | Kiekintveld, Christopher (University of Texas at EI Paso)
The Strong Stackelberg Equilibrium (SSE) has drawn extensive attention recently in several security domains, which optimizes the defender's random allocation of limited security resources. However, the SSE concept neglects the advantage of defender's strategic revelation of her private information, and overestimates the observation ability of the adversaries. In this paper, we overcome these restrictions and analyze the tradeoff between strategic secrecy and commitment in security games. We propose a Disguised-resource Security Game (DSG) where the defender strategically disguises some of her resources. We compare strategic information revelation with public commitment and formally show that they have different advantages depending the payoff structure. To compute the Perfect Bayesian Equilibrium (PBE), several novel approaches are provided, including basic MILP formulations with mixed defender strategy and compact representation, a novel algorithm based on support set enumeration, and an approximation algorithm for epsilon-PBE. Extensive experimental evaluation shows that both strategic secrecy and Stackelberg commitment are critical measures in security domain, and our approaches can solve PBE for realistic-sized problems with good enough and robust solution quality.
Deploying PAWS to Combat Poaching: Game-Theoretic Patrolling in Areas with Complex Terrain (Demonstration)
Fang, Fei (University of Southern California) | Nguyen, Thanh H. (University of Southern California) | Pickles, Rob (Panthera) | Lam, Wai Y. (Panthera, Rimba) | Clements, Gopalasamy R. (Universiti Malaysia Terengganu) | An, Bo (Nanyang Technological University) | Singh, Amandeep (Columbia University) | Tambe, Milind (University of Southern California)
The conservation of key wildlife species such as tigers and elephants are threatened by poaching activities. In many conservation areas, foot patrols are conducted to prevent poaching but they may not be well-planned to make the best use of the limited patrolling resources. While prior work has introduced PAWS (Protection Assistant for Wildlife Security) as a game-theoretic decision aid to design effective foot patrol strategies to protect wildlife, the patrol routes generated by PAWS may be difficult to follow in areas with complex terrain. Subsequent research has worked on the significant evolution of PAWS, from an emerging application to a regularly deployed software. A key advance of the deployed version of PAWS is that it incorporates the complex terrain information and generates a strategy consisting of easy-to-follow routes. In this demonstration, we provide 1) a video introducing the PAWS system; 2) an interactive visualization of the patrol routes generated by PAWS in an example area with complex terrain; and 3) a machine-human competition in designing patrol strategy given complex terrain and animal distribution.
Optimizing Personalized Email Filtering Thresholds to Mitigate Sequential Spear Phishing Attacks
Zhao, Mengchen (Nanyang Technological University) | An, Bo (Nanyang Technological University) | Kiekintveld, Christopher (University of Texas at El Paso)
Highly targeted spear phishing attacks are increasingly common, and have been implicated in many major security breeches. Email filtering systems are the first line of defense against such attacks. These filters are typically configured with uniform thresholds for deciding whether or not to allow a message to be delivered to a user. However, users have very significant differences in both their susceptibility to phishing attacks as well as their access to critical information and credentials that can cause damage. Recent work has considered setting personalized thresholds for individual users based on a Stackelberg game model. We consider two important extensions of the previous model. First, in our model user values can be substitutable, modeling cases where multiple users provide access to the same information or credential. Second, we consider attackers who make sequential attack plans based on the outcome of previous attacks. Our analysis starts from scenarios where there is only one credential and then extends to more general scenarios with multiple credentials. For single-credential scenarios, we demonstrate that the optimal defense strategy can be found by solving a binary combinatorial optimization problem called PEDS. For multiple-credential scenarios, we formulate it as a bilevel optimization problem for finding the optimal defense strategy and then reduce it to a single level optimization problem called PEMS using complementary slackness conditions. Experimental results show that both PEDS and PEMS lead to significant higher defender utilities than two existing benchmarks in different parameter settings. Also, both PEDS and PEMS are more robust than the existing benchmarks considering uncertainties.
Computing Optimal Monitoring Strategy for Detecting Terrorist Plots
Wang, Zhen (Nanyang Technological University) | Yin, Yue (University of Chinese Academy of Sciences) | An, Bo (Nanyang Technological University)
In recent years, terrorist organizations (e.g., ISIS or al-Qaeda) are increasingly directing terrorists to launch coordinated attacks in their home countries. One example is the Paris shootings on January 7, 2015.By monitoring potential terrorists, security agencies are able to detect and stop terrorist plots at their planning stage.Although security agencies may have knowledge about potential terrorists (e.g., who they are, how they interact), they usually have limited resources and cannot monitor all terrorists.Moreover, a terrorist planner may strategically choose to arouse terrorists considering the security agency's monitoring strategy. This paper makes five key contributions toward the challenging problem of computing optimal monitoring strategies: 1) A new Stackelberg game model for terrorist plot detection;2) A modified double oracle framework for computing the optimal strategy effectively;3) Complexity results for both defender and attacker oracle problems;4) Novel mixed-integer linear programming (MILP) formulations for best response problems of both players;and 5) Effective approximation algorithms for generating suboptimal responses for both players.Experimental evaluation shows that our approach can obtain a robust enough solution outperforming widely-used centrality based heuristics significantly and scale up to realistic-sized problems.
Efficient Average Reward Reinforcement Learning Using Constant Shifting Values
Yang, Shangdong (Nanjing University) | Gao, Yang (Nanjing University) | An, Bo (Nanyang Technological University) | Wang, Hao (Nanjing University) | Chen, Xingguo (Nanjing University of Posts and Telecommunications)
There are two classes of average reward reinforcement learning (RL) algorithms: model-based ones that explicitly maintain MDP models and model-free ones that do not learn such models. Though model-free algorithms are known to be more efficient, they often cannot converge to optimal policies due to the perturbation of parameters. In this paper, a novel model-free algorithm is proposed, which makes use of constant shifting values (CSVs) estimated from prior knowledge. To encourage exploration during the learning process, the algorithm constantly subtracts the CSV from the rewards. A terminating condition is proposed to handle the unboundedness of Q-values caused by such substraction. The convergence of the proposed algorithm is proved under very mild assumptions. Furthermore, linear function approximation is investigated to generalize our method to handle large-scale tasks. Extensive experiments on representative MDPs and the popular game Tetris show that the proposed algorithms significantly outperform the state-of-the-art ones.