Goto

Collaborating Authors

Bosansky, Branislav


Discovering Imperfectly Observable Adversarial Actions using Anomaly Detection

arXiv.org Machine Learning

Anomaly detection is a method for discovering unusual and suspicious behavior. In many real-world scenarios, the examined events can be directly linked to the actions of an adversary, such as attacks on computer networks or frauds in financial operations. While the defender wants to discover such malicious behavior, the attacker seeks to accomplish their goal (e.g., exfiltrating data) while avoiding the detection. To this end, anomaly detectors have been used in a game-theoretic framework that captures these goals of a two-player competition. We extend the existing models to more realistic settings by (1) allowing both players to have continuous action spaces and by assuming that (2) the defender cannot perfectly observe the action of the attacker. We propose two algorithms for solving such games -- a direct extension of existing algorithms based on discretizing the feature space and linear programming and the second algorithm based on constrained learning. Experiments show that both algorithms are applicable for cases with low feature space dimensions but the learning-based method produces less exploitable strategies and it is scalable to higher dimensions. Moreover, we use real-world data to compare our approaches with existing classifiers in a data-exfiltration scenario via the DNS channel. The results show that our models are significantly less exploitable by an informed attacker.


Convergence of Monte Carlo Tree Search in Simultaneous Move Games

Neural Information Processing Systems

In this paper, we study Monte Carlo tree search (MCTS) in zero-sum extensive-form games with perfect information and simultaneous moves. We present a general template of MCTS algorithms for these games, which can be instantiated by various selection methods. We formally prove that if a selection method is $\epsilon$-Hannan consistent in a matrix game and satisfies additional requirements on exploration, then the MCTS algorithm eventually converges to an approximate Nash equilibrium (NE) of the extensive-form game. We empirically evaluate this claim using regret matching and Exp3 as the selection methods on randomly generated and worst case games. We confirm the formal result and show that additional MCTS variants also converge to approximate NE on the evaluated games.


Strategic Information Revelation and Commitment in Security Games

AAAI Conferences

The Strong Stackelberg Equilibrium (SSE) has drawn extensive attention recently in several security domains, which optimizes the defender's random allocation of limited security resources. However, the SSE concept neglects the advantage of defender's strategic revelation of her private information, and overestimates the observation ability of the adversaries. In this paper, we overcome these restrictions and analyze the tradeoff between strategic secrecy and commitment in security games. We propose a Disguised-resource Security Game (DSG) where the defender strategically disguises some of her resources. We compare strategic information revelation with public commitment and formally show that they have different advantages depending the payoff structure. To compute the Perfect Bayesian Equilibrium (PBE), several novel approaches are provided, including basic MILP formulations with mixed defender strategy and compact representation, a novel algorithm based on support set enumeration, and an approximation algorithm for epsilon-PBE. Extensive experimental evaluation shows that both strategic secrecy and Stackelberg commitment are critical measures in security domain, and our approaches can solve PBE for realistic-sized problems with good enough and robust solution quality.


Towards Solving Imperfect Recall Games

AAAI Conferences

Imperfect recall games represent dynamic interactions where players can forget previously known information, such as the exact history of played actions. Opposed to perfect recall games, where the players remember all information, imperfect recall games allow a concise representation of strategies. However, most of the existing algorithmic results are negative for imperfect recall games and many theoretical and computational results do not translate from perfect recall games. The goal of this paper is to (1) summarize the existing results regarding imperfect recall games and (2) extend these results to a restricted subclass of imperfect recall games termed A-loss recall games. Finally, (3) we emphasize the impact of these theoretical results on algorithms that compute (approximately) optimal strategies in extensive-form games and show that they cannot be easily extended to imperfect recall games.


Combining Incremental Strategy Generation and Branch and Bound Search for Computing Maxmin Strategies in Imperfect Recall Games

AAAI Conferences

Extensive-form games with imperfect recall are an important model of dynamic games where the players forget previously known information. Often, imperfect recall games are the result of an abstraction algorithm that simplifies a large game with perfect recall. Unfortunately, solving an imperfect recall game has fundamental problems since a Nash equilibrium does not have to exist. Alternatively, we can seek maxmin strategies that guarantee an expected outcome. The only existing algorithm computing maxmin strategies in imperfect recall games, however, requires approximating a bilinear program that is proportional to the size of the game and thus has a limited scalability. We propose a novel algorithm for computing maxmin strategies that combines this approximate algorithm with an incremental strategy-generation technique designed previously for extensive-form games with perfect recall. Experimental evaluation shows that the novel algorithm builds only a fraction of the game tree and improves the scalability by several orders of magnitude. Finally, we demonstrate that our algorithm can solve an abstracted variant of a large game faster compared to the algorithms operating on the unabstracted perfect-recall variant.


Cermak

AAAI Conferences

Strong Stackelberg Equilibrium (SSE) is a fundamental solution concept in game theory in which one player commits to a strategy, while the other player observes this commitment and plays a best response. We present a new algorithm for computing SSE for two-player extensive-form general-sum games with imperfect information (EFGs) where computing SSE is an NP-hard problem. Our algorithm is based on a correlated version of SSE, known as Stackelberg Extensive-Form Correlated Equilibrium (SEFCE). Our contribution is therefore twofold: (1) we give the first linear program for computing SEFCE in EFGs without chance, (2) we repeatedly solve and modify this linear program in a systematic search until we arrive to SSE. Our new algorithm outperforms the best previous algorithms by several orders of magnitude.


Using Correlated Strategies for Computing Stackelberg Equilibria in Extensive-Form Games

AAAI Conferences

Strong Stackelberg Equilibrium (SSE) is a fundamental solution concept in game theory in which one player commits to a strategy, while the other player observes this commitment and plays a best response. We present a new algorithm for computing SSE for two-player extensive-form general-sum games with imperfect information (EFGs) where computing SSE is an NP-hard problem. Our algorithm is based on a correlated version of SSE, known as Stackelberg Extensive-Form Correlated Equilibrium (SEFCE). Our contribution is therefore twofold: (1) we give the first linear program for computing SEFCE in EFGs without chance, (2) we repeatedly solve and modify this linear program in a systematic search until we arrive to SSE. Our new algorithm outperforms the best previous algorithms by several orders of magnitude.


Sequence-Form Algorithm for Computing Stackelberg Equilibria in Extensive-Form Games

AAAI Conferences

Stackelberg equilibrium is a solution concept prescribing for a player an optimal strategy to commit to, assuming the opponent knows this commitment and plays the best response. Although this solution concept is a cornerstone of many security applications, the existing works typically do not consider situations where the players can observe and react to the actions of the opponent during the course of the game. We extend the existing algorithmic work to extensive-form games and introduce novel algorithm for computing Stackelberg equilibria that exploits the compact sequence-form representation of strategies. Our algorithm reduces the size of the linear programs from exponential in the baseline approach to linear in the size of the game tree. Experimental evaluation on randomly generated games and a security-inspired search game demonstrates significant improvement in the scalability compared to the baseline approach.


Combining Compact Representation and Incremental Generation in Large Games with Sequential Strategies

AAAI Conferences

Many search and security games played on a graph can be modeled as normal-form zero-sum games with strategies consisting of sequences of actions. The size of the strategy space provides a computational challenge when solving these games. This complexity is tackled either by using the compact representation of sequential strategies and linear programming, or by incremental strategy generation of iterative double-oracle methods. In this paper, we present novel hybrid of these two approaches: compact-strategy double-oracle (CS-DO) algorithm that combines the advantages of the compact representation with incremental strategy generation. We experimentally compare CS-DO with the standard approaches and analyze the impact of the size of the support on the performance of the algorithms. Results show that CS-DO dramatically improves the convergence rate in games with non-trivial support


Convergence of Monte Carlo Tree Search in Simultaneous Move Games

Neural Information Processing Systems

In this paper, we study Monte Carlo tree search (MCTS) in zero-sum extensive-form games with perfect information and simultaneous moves. We present a general template of MCTS algorithms for these games, which can be instantiated by various selection methods. We formally prove that if a selection method is $\epsilon$-Hannan consistent in a matrix game and satisfies additional requirements on exploration, then the MCTS algorithm eventually converges to an approximate Nash equilibrium (NE) of the extensive-form game. We empirically evaluate this claim using regret matching and Exp3 as the selection methods on randomly generated and worst case games. We confirm the formal result and show that additional MCTS variants also converge to approximate NE on the evaluated games.