Goto

Collaborating Authors

 bilinear program





Computing Optimal Nash Equilibria in Multiplayer Games

Neural Information Processing Systems

There are other approaches (e.g., [ Here, if all team members play strategies according to an NE minimizing the adversary's utility, the Eq.(1c) ensures that binary variable This space is represented by Eq.(1), which involves nonlinear terms in Eq.(1a) Section 3.4 shows that our techniques can significantly reduce the time The procedure of CRM is shown in Algorithm 2, which is illustrated in Appendix A. A collection N of subsets of players is a binary collection if: 1. { i | i N } N ; Eqs.(1b)-(1g), (3), and (4) is the space of NEs. Example 1 provides an example of N .



Strategic Concealment of Environment Representations in Competitive Games

arXiv.org Artificial Intelligence

This paper investigates the strategic concealment of environment representations used by players in competitive games. We consider a defense scenario in which one player (the Defender) seeks to infer and exploit the representation used by the other player (the Attacker). The interaction between the two players is modeled as a Bayesian game: the Defender infers the Attacker's representation from its trajectory and places barriers to obstruct the Attacker's path towards its goal, while the Attacker obfuscates its representation type to mislead the Defender. We solve for the Perfect Bayesian Nash Equilibrium via a bilinear program that integrates Bayesian inference, strategic planning, and belief manipulation. Simulations show that purposeful concealment naturally emerges: the Attacker randomizes its trajectory to manipulate the Defender's belief, inducing suboptimal barrier selections and thereby gaining a strategic advantage.


Formal Verification of Markov Processes with Learned Parameters

arXiv.org Artificial Intelligence

We introduce the problem of formally verifying properties of Markov processes where the parameters are the output of machine learning models. Our formulation is general and solves a wide range of problems, including verifying properties of probabilistic programs that use machine learning, and subgroup analysis in healthcare modeling. We show that for a broad class of machine learning models, including linear models, tree-based models, and neural networks, verifying properties of Markov chains like reachability, hitting time, and total reward can be formulated as a bilinear program. We develop a decomposition and bound propagation scheme for solving the bilinear program and show through computational experiments that our method solves the problem to global optimality up to 100x faster than state-of-the-art solvers. We also release $\texttt{markovml}$, an open-source tool for building Markov processes, integrating pretrained machine learning models, and verifying their properties, available at https://github.com/mmaaz-git/markovml.


A New Theory for Matrix Completion

Neural Information Processing Systems

Prevalent matrix completion theories reply on an assumption that the locations of the missing data are distributed uniformly and randomly (i.e., uniform sampling). Nevertheless, the reason for observations being missing often depends on the unseen observations themselves, and thus the missing data in practice usually occurs in a nonuniform and deterministic fashion rather than randomly. To break through the limits of random sampling, this paper introduces a new hypothesis called isomeric condition, which is provably weaker than the assumption of uniform sampling and arguably holds even when the missing data is placed irregularly. Equipped with this new tool, we prove a series of theorems for missing data recovery and matrix completion. In particular, we prove that the exact solutions that identify the target matrix are included as critical points by the commonly used nonconvex programs. Unlike the existing theories for nonconvex matrix completion, which are built upon the same condition as convex programs, our theory shows that nonconvex programs have the potential to work with a much weaker condition. Comparing to the existing studies on nonuniform sampling, our setup is more general.


Clustering via Concave Minimization

Neural Information Processing Systems

The problem of assigning m points in the n-dimensional real space Rn to k clusters is formulated as that of determining k centers in Rn such that the sum of distances of each point to the nearest center is minimized. If a polyhedral distance is used, the problem can be formulated as that of minimizing a piecewise-linear concave function on a polyhedral set which is shown to be equivalent to a bilinear program: minimizing a bilinear function on a polyhe(cid:173) dral set. A fast finite k-Median Algorithm consisting of solving few linear programs in closed form leads to a stationary point of the bilinear program. Computational testing on a number of real(cid:173) world databases was carried out. On the Wisconsin Diagnostic Breast Cancer (WDBC) database, k-Median training set correct(cid:173) ness was comparable to that of the k-Mean Algorithm, however its testing set correctness was better.


Optimistic Whittle Index Policy: Online Learning for Restless Bandits

arXiv.org Artificial Intelligence

Restless multi-armed bandits (RMABs) extend multi-armed bandits to allow for stateful arms, where the state of each arm evolves restlessly with different transitions depending on whether that arm is pulled. Solving RMABs requires information on transition dynamics, which are often unknown upfront. To plan in RMAB settings with unknown transitions, we propose the first online learning algorithm based on the Whittle index policy, using an upper confidence bound (UCB) approach to learn transition dynamics. Specifically, we estimate confidence bounds of the transition probabilities and formulate a bilinear program to compute optimistic Whittle indices using these estimates. Our algorithm, UCWhittle, achieves sublinear $O(H \sqrt{T \log T})$ frequentist regret to solve RMABs with unknown transitions in $T$ episodes with a constant horizon $H$. Empirically, we demonstrate that UCWhittle leverages the structure of RMABs and the Whittle index policy solution to achieve better performance than existing online learning baselines across three domains, including one constructed from a real-world maternal and childcare dataset.