Goto

Collaborating Authors

 Maiti, Arnab


Query Complexity of Tournament Solutions

arXiv.org Artificial Intelligence

A directed graph where there is exactly one edge between every pair of vertices is called a {\em tournament}. Finding the "best" set of vertices of a tournament is a well studied problem in social choice theory. A {\em tournament solution} takes a tournament as input and outputs a subset of vertices of the input tournament. However, in many applications, for example, choosing the best set of drugs from a given set of drugs, the edges of the tournament are given only implicitly and knowing the orientation of an edge is costly. In such scenarios, we would like to know the best set of vertices (according to some tournament solution) by "querying" as few edges as possible. We, in this paper, precisely study this problem for commonly used tournament solutions: given an oracle access to the edges of a tournament T, find $f(T)$ by querying as few edges as possible, for a tournament solution f. We first show that the set of Condorcet non-losers in a tournament can be found by querying $2n-\lfloor \log n \rfloor -2$ edges only and this is tight in the sense that every algorithm for finding the set of Condorcet non-losers needs to query at least $2n-\lfloor \log n \rfloor -2$ edges in the worst case, where $n$ is the number of vertices in the input tournament. We then move on to study other popular tournament solutions and show that any algorithm for finding the Copeland set, the Slater set, the Markov set, the bipartisan set, the uncovered set, the Banks set, and the top cycle must query $\Omega(n^2)$ edges in the worst case. On the positive side, we are able to circumvent our strong query complexity lower bound results by proving that, if the size of the top cycle of the input tournament is at most $k$, then we can find all the tournament solutions mentioned above by querying $O(nk + \frac{n\log n}{\log(1-\frac{1}{k})})$ edges only.


Near-Optimal Pure Exploration in Matrix Games: A Generalization of Stochastic Bandits & Dueling Bandits

arXiv.org Artificial Intelligence

We study the sample complexity of identifying the pure strategy Nash equilibrium (PSNE) in a two-player zero-sum matrix game with noise. Formally, we are given a stochastic model where any learner can sample an entry $(i,j)$ of the input matrix $A\in[-1,1]^{n\times m}$ and observe $A_{i,j}+\eta$ where $\eta$ is a zero-mean 1-sub-Gaussian noise. The aim of the learner is to identify the PSNE of $A$, whenever it exists, with high probability while taking as few samples as possible. Zhou et al. (2017) presents an instance-dependent sample complexity lower bound that depends only on the entries in the row and column in which the PSNE lies. We design a near-optimal algorithm whose sample complexity matches the lower bound, up to log factors. The problem of identifying the PSNE also generalizes the problem of pure exploration in stochastic multi-armed bandits and dueling bandits, and our result matches the optimal bounds, up to log factors, in both the settings.


Instance-dependent Sample Complexity Bounds for Zero-sum Matrix Games

arXiv.org Artificial Intelligence

We study the sample complexity of identifying an approximate equilibrium for two-player zero-sum $n\times 2$ matrix games. That is, in a sequence of repeated game plays, how many rounds must the two players play before reaching an approximate equilibrium (e.g., Nash)? We derive instance-dependent bounds that define an ordering over game matrices that captures the intuition that the dynamics of some games converge faster than others. Specifically, we consider a stochastic observation model such that when the two players choose actions $i$ and $j$, respectively, they both observe each other's played actions and a stochastic observation $X_{ij}$ such that $\mathbb E[ X_{ij}] = A_{ij}$. To our knowledge, our work is the first case of instance-dependent lower bounds on the number of rounds the players must play before reaching an approximate equilibrium in the sense that the number of rounds depends on the specific properties of the game matrix $A$ as well as the desired accuracy. We also prove a converse statement: there exist player strategies that achieve this lower bound.


Streaming Algorithms for Stochastic Multi-armed Bandits

arXiv.org Machine Learning

We study the Stochastic Multi-armed Bandit problem under bounded arm-memory. In this setting, the arms arrive in a stream, and the number of arms that can be stored in the memory at any time, is bounded. The decision-maker can only pull arms that are present in the memory. We address the problem from the perspective of two standard objectives: 1) regret minimization, and 2) best-arm identification. For regret minimization, we settle an important open question by showing an almost tight hardness. We show {\Omega}(T^{2/3}) cumulative regret in expectation for arm-memory size of (n-1), where n is the number of arms. For best-arm identification, we study two algorithms. First, we present an O(r) arm-memory r-round adaptive streaming algorithm to find an {\epsilon}-best arm. In r-round adaptive streaming algorithm for best-arm identification, the arm pulls in each round are decided based on the observed outcomes in the earlier rounds. The best-arm is the output at the end of r rounds. The upper bound on the sample complexity of our algorithm matches with the lower bound for any r-round adaptive streaming algorithm. Secondly, we present a heuristic to find the {\epsilon}-best arm with optimal sample complexity, by storing only one extra arm in the memory.