Goto

Collaborating Authors

 Duke University


Beat the Cheater: Computing Game-Theoretic Strategies for When to Kick a Gambler out of a Casino

AAAI Conferences

Gambles in casinos are usually set up so that the casino makes a profit in expectation -- as long as gamblers play honestly. However, some gamblers are able to cheat, reducing the casino’s profit. How should the casino address this? A common strategy is to selectively kick gamblers out, possibly even without being sure that they were cheating. In this paper, we address the following question: Based solely on a gambler’s track record,when is it optimal for the casino to kick the gambler out? Because cheaters will adapt to the casino’s policy, this is a game-theoretic question. Specifically, we model the problem as a Bayesian game in which the casino is a Stackelberg leader that can commit to a (possibly randomized) policy for when to kick gamblers out, and we provide efficient algorithms for computing the optimal policy. Besides being potentially useful to casinos, we imagine that similar techniques could be useful for addressing related problems -- for example, illegal trades in financial markets.


On the Axiomatic Characterization of Runoff Voting Rules

AAAI Conferences

Runoff voting rules such as single transferable vote (STV) and Baldwin's rule are of particular interest in computational social choice due to their recursive nature and hardness of manipulation, as well as in (human) practice because they are relatively easy to understand. However, they are not known for their compliance with desirable axiomatic properties, which we attempt to rectify here. We characterize runoff rules that are based on scoring rules using two axioms: a weakening of local independence of irrelevant alternatives and a variant of population-consistency. We then show, as our main technical result, that STV is the only runoff scoring rule satisfying an independence-of-clones property. Furthermore, we provide axiomatizations of Baldwin's rule and Coombs' rule.


Mechanism Design for Scheduling with Uncertain Execution Time

AAAI Conferences

We study the problem where a task (or multiple unrelated tasks) must be executed, there are multiple machines/agents that can potentially perform the task, and our objective is to minimize the expected sum of the agents' processing times. Each agent does not know exactly how long it will take him to finish the task; he only knows the distribution from which this time is drawn. These times are independent across agents and the distributions fulfill the monotone hazard rate condition. Agents are selfish and will lie about their distributions if this increases their expected utility. We study different variations of the Vickrey mechanism that take as input the agents' reported distributions and the players' realized running times and that output a schedule that minimizes the expected sum of processing times, as well as payments that make it an ex-post equilibrium for the agents to both truthfully report their distributions and exert full effort to complete the task. We devise the ChPE mechanism, which is uniquely tailored to our problem, and has many desirable properties including: not rewarding agents that fail to finish the task and having non-negative payments.


Solving Security Games on Graphs via Marginal Probabilities

AAAI Conferences

Security games involving the allocation of multiple security resources to defend multiple targets generally have an exponential number of pure strategies for the defender. One method that has been successful in addressing this computational issue is to instead directly compute the marginal probabilities with which the individual resources are assigned (first pursued by Kiekintveld et al. (2009)). However, in sufficiently general settings, there exist games where these marginal solutions are not implementable, that is, they do not correspond to any mixed strategy of the defender. In this paper, we examine security games where the defender tries to monitor the vertices of a graph, and we show how the type of graph, the type of schedules, and the type of defender resources affect the applicability of this approach. In some settings, we show the approach is applicable and give a polynomial-time algorithm for computing an optimal defender strategy; in other settings, we give counterexample games that demonstrate that the approach does not work, and prove NP-hardness results for computing an optimal defender strategy.


PAC Optimal Exploration in Continuous Space Markov Decision Processes

AAAI Conferences

Current exploration algorithms can be classified in two broad categories: Heuristic, and PAC optimal. While numerous researchers have used heuristic approaches such as epsilon-greedy exploration successfully, such approaches lack formal, finite sample guarantees and may need a significant amount of fine-tuning to produce good results. PAC optimal exploration algorithms, on the other hand, offer strong theoretical guarantees but are inapplicable in domains of realistic size. The goal of this paper is to bridge the gap between theory and practice, by introducing C-PACE, an algorithm which offers strong theoretical guarantees and can be applied to interesting, continuous space problems.


Sample Complexity and Performance Bounds for Non-Parametric Approximate Linear Programming

AAAI Conferences

One of the most difficult tasks in value function approximation for Markov Decision Processes is finding an approximation architecture that is expressive enough to capture the important structure in the value function, while at the same time not overfitting the training samples. Recent results in non-parametric approximate linear programming (NP-ALP), have demonstrated that this can be done effectively using nothing more than a smoothness assumption on the value function. In this paper we extend these results to the case where samples come from real world transitions instead of the full Bellman equation, adding robustness to noise. In addition, we provide the first max-norm, finite sample performance guarantees for any form of ALP. NP-ALP is amenable to problems with large (multidimensional) or even infinite (continuous) action spaces, and does not require a model to select actions using the resulting approximate solution.


Fast Equilibrium Computation for Infinitely Repeated Games

AAAI Conferences

It is known that an equilibrium of an infinitely repeated two-player game (with limit average payoffs) can be computed in polynomial time, as follows: according to the folk theorem, we compute minimax strategies for both players to calculate the punishment values, and subsequently find a mixture over outcomes that exceeds these punishment values. However, for very large games, even computing minimax strategies can be prohibitive. In this paper, we propose an algorithmic framework for computing equilibria of repeated games that does not require linear programming and that does not necessarily need to inspect all payoffs of the game. This algorithm necessarily sometimes fails to compute an equilibrium, but we mathematically demonstrate that most of the time it succeeds quickly on uniformly random games, and experimentally demonstrate this for other classes of games. This also holds for games with more than two players, for which no efficient general algorithms are known.


Computing Optimal Strategies to Commit to in Stochastic Games

AAAI Conferences

Significant progress has been made recently in the following two lines of research in the intersection of AI and game theory: (1) the computation of optimal strategies to commit to (Stackelberg strategies), and (2) the computation of correlated equilibria of stochastic games. In this paper, we unite these two lines of research by studying the computation of Stackelberg strategies in stochastic games. We provide theoretical results on the value of being able to commit and the value of being able to correlate, as well as complexity results about computing Stackelberg strategies in stochastic games. We then modify the QPACE algorithm (MacDermed et al. 2011) to compute Stackelberg strategies, and provide experimental results.


Computing Game-Theoretic Solutions and Applications to Security

AAAI Conferences

The multiagent systems community has adopted game theory as a framework for the design of systems of multiple self-interested agents. For this to be effective, efficient algorithms must be designed to compute the solutions that game theory prescribes. In this paper, I summarize some of the state of the art on this topic, focusing particularly on how this line of work has contributed to several highly visible deployed security applications, developed at the University of Southern California.


Evaluating Resistance to False-Name Manipulations in Elections

AAAI Conferences

In many mechanisms (especially online mechanisms), a strategic agent can influence the outcome by creating multiple false identities. We consider voting settings where the mechanism designer cannot completely prevent false-name manipulation, but may use false-name-limiting methods such as CAPTCHAs to influence the amount and characteristics of such manipulation. Such a designer would prefer, first, a high probability of obtaining the “correct” outcome, and second, a statistical method for evaluating the correctness of the outcome. In this paper, we focus on settings with two alternatives. We model voters as independently drawing a number of identities from a distribution that may be influenced by the choice of the false-name-limiting method. We give a criterion for the evaluation and comparison of these distributions. Then, given the results of an election in which false-name manipulation may have occurred, we propose and justify a statistical test for evaluating the outcome.