# Search

### Linares López

In this paper we propose a new algorithm for solving general two-player turn-taking games that performs symbolic search using binary decision diagrams (BDDs). It consists of two stages: first, it determines all breadth-first search (BFS) layers using forward search, omitting duplicate detection; next, the solving process operates in the backward direction only within these BFS layers, thereby partitioning all BDDs according to the layers in which the states reside. We provide experimental results for selected games and compare against a previous approach. This comparison shows that in most cases the new algorithm outperforms the existing one in terms of runtime and memory, so that it can solve games that could not previously be solved with a general approach.
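As a rough illustration of the two-stage scheme, here is a Python sketch that uses explicit state sets as a stand-in for BDDs, on a toy take-away game (the game and all names are assumptions for illustration, not the paper's benchmarks):

```python
# Stage 1: forward BFS collecting layer-wise state sets (duplicate
# detection across layers omitted, as in the paper).
# Stage 2: backward solving, layer by layer, only within reached layers.
# Toy game: one pile of stones, players alternately take 1 or 2;
# a player who cannot move loses.

def moves(state):
    stones, player = state
    return [(stones - k, 1 - player) for k in (1, 2) if k <= stones]

def solve(n_stones):
    init = (n_stones, 0)
    layers = [{init}]
    while True:
        nxt = {t for s in layers[-1] for t in moves(s)}
        if not nxt:
            break
        layers.append(nxt)
    win = set()  # states in which the player to move wins
    for layer in reversed(layers):
        for s in layer:
            succ = moves(s)
            # win if some successor is a loss for the opponent to move
            if succ and any(t not in win for t in succ):
                win.add(s)
    return init in win
```

With real BDDs, `moves` becomes a relational image/preimage computation and each `layers[i]` and `win`-per-layer partition is a separate BDD, which is the point of the layer-wise partitioning.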

### Pitt Researcher Uses Video Games to Unlock New Levels of AI

The University of Pittsburgh's Daniel Jiang designs algorithms that learn decision strategies in complex and uncertain environments, and tests them on a genre of video games called Multiplayer Online Battle Arena (MOBA). MOBAs involve players controlling one of several "hero" characters in order to destroy opponents' bases while protecting their own. A successful algorithm for training a gameplay artificial intelligence system must overcome several challenges, such as real-time decision making and long decision horizons. Jiang's team designed their algorithm to evaluate 41 pieces of information and output one of 22 different actions; the most successful player used the Monte Carlo tree search method to generate data, which was fed into a neural network.
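The article gives no code; the following is a minimal, self-contained sketch of a Monte Carlo tree search loop on a toy task, with the root visit counts playing the role of the data that would be fed to a neural network (the toy reward, depth, and all names are assumptions for illustration):

```python
import math, random

DEPTH, ACTIONS = 3, (0, 1)

def rollout(state):
    # finish the episode with random actions; reward = fraction of 1s chosen
    while len(state) < DEPTH:
        state = state + (random.choice(ACTIONS),)
    return sum(state) / DEPTH

def mcts(n_sim=500, c=1.4):
    stats = {}  # state -> (visits, total_reward)
    root = ()
    for _ in range(n_sim):
        state, path = root, [root]
        while len(state) < DEPTH:
            children = [state + (a,) for a in ACTIONS]
            unseen = [ch for ch in children if ch not in stats]
            if unseen:                      # expansion: try a new child
                state = unseen[0]
                path.append(state)
                break
            n_parent = sum(stats[ch][0] for ch in children)
            state = max(children,           # selection: UCT rule
                        key=lambda ch: stats[ch][1] / stats[ch][0]
                        + c * math.sqrt(math.log(n_parent) / stats[ch][0]))
            path.append(state)
        reward = rollout(state)
        for s in path:                      # backpropagation
            v, t = stats.get(s, (0, 0.0))
            stats[s] = (v + 1, t + reward)
    # visit distribution over root actions: the kind of training target
    # a search procedure can hand to a neural network
    return {a: stats[(a,)][0] for a in ACTIONS}
```

Since choosing action 1 raises the reward, the search concentrates its visits on that root action, and it is visit distributions like this that can serve as supervised targets for a network.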

### Theoretical Analysis of Adversarial Learning: A Minimax Approach

We propose a general theoretical method for analyzing the risk bound in the presence of adversaries. In particular, we fit the adversarial learning problem into the minimax framework. We first show that the original adversarial learning problem can be reduced to a minimax statistical learning problem by introducing a transport map between distributions. We then prove a risk bound for this minimax problem in terms of covering numbers. In contrast to previous minimax bounds in \cite{lee,far}, our bound is informative when the radius of the ambiguity set is small. Our method can be applied to multi-class classification problems and commonly used loss functions such as the hinge loss and ramp loss. As two illustrative examples, we derive adversarial risk bounds for kernel SVMs and deep neural networks. Our results indicate that a stronger adversary might have a negative impact on the complexity of the hypothesis class, and that the existence of a margin can serve as a defense mechanism to counter adversarial attacks.
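Schematically, the reduction can be sketched as follows (a sketch only; the notation, with $W_\infty$ for the $\infty$-Wasserstein distance, is an assumption, since the abstract does not fix it). The pointwise adversarial risk is dominated by a distributionally robust risk obtained by pushing $P$ forward along a transport map:

$$
R_{\mathrm{adv}}(f) \;=\; \mathbb{E}_{(x,y)\sim P}\Big[\sup_{\|x'-x\|\le \epsilon} \ell\big(f(x'),y\big)\Big] \;\le\; \sup_{P' :\, W_\infty(P',P)\le \epsilon} \mathbb{E}_{(x',y)\sim P'}\big[\ell\big(f(x'),y\big)\big],
$$

so covering-number bounds for the minimax (distributionally robust) problem on the right transfer to the adversarial risk on the left.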

### Learning Beam Search Policies via Imitation Learning

Beam search is widely used for approximate decoding in structured prediction problems. Models often use a beam at test time but ignore its existence at train time, and therefore do not explicitly learn how to use the beam. We develop a unifying meta-algorithm for learning beam search policies using imitation learning. In our setting, the beam is part of the model, not just an artifact of approximate decoding. Our meta-algorithm captures existing learning algorithms and suggests new ones. It also lets us show novel no-regret guarantees for learning beam search policies.
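As a concrete picture of the object being learned, here is a plain beam search in Python where the scoring function stands in for the learned policy (the function names and toy vocabulary are assumptions for illustration, not the paper's setup):

```python
# Beam search where `score` stands in for a learned policy.

def beam_search(score, vocab, beam_width, max_len):
    """score(prefix, token) -> float; higher is better."""
    beam = [((), 0.0)]
    for _ in range(max_len):
        candidates = [(prefix + (tok,), s + score(prefix, tok))
                      for prefix, s in beam for tok in vocab]
        # the top-k pruning step: under "beam as part of the model",
        # the policy must learn to score so that good prefixes survive it
        beam = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beam[0][0]
```

Roughly, beam-aware imitation learning collects a training signal at exactly this pruning step, e.g. whenever the oracle's prefix falls off the beam, rather than training the scorer as if decoding were exact.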

### P-MCGS: Parallel Monte Carlo Acyclic Graph Search

Recently, there has been great interest in Monte Carlo Tree Search (MCTS) in AI research. Although the sequential version of MCTS has been studied widely, its parallel counterpart still lacks systematic study. This leads us to the following questions: \emph{how can we design efficient parallel MCTS (or more general) algorithms with rigorous theoretical guarantees? Is it possible to achieve linear speedup?} In this paper, we consider the search problem on a more general acyclic single-root graph (namely, Monte Carlo Graph Search (MCGS)), which generalizes MCTS. We develop a parallel algorithm (P-MCGS) that assigns multiple workers to investigate appropriate leaf nodes simultaneously. Our analysis shows that the P-MCGS algorithm achieves linear speedup and that its sample complexity is comparable to that of its sequential counterpart.
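A round-based sketch of the worker-assignment idea, reduced to a flat set of leaves (a cartoon, not the paper's algorithm; the leaf set, reward function, and all names are assumptions):

```python
import math
from concurrent.futures import ThreadPoolExecutor

def parallel_search(leaves, reward, n_workers=4, rounds=50, c=1.4):
    """Each round, assign n_workers to *distinct* promising leaves at once."""
    stats = {leaf: [1, reward(leaf)] for leaf in leaves}  # visits, total
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for _ in range(rounds):
            total = sum(v for v, _ in stats.values())
            # rank leaves by an upper confidence bound, take the top n_workers
            picked = sorted(leaves,
                            key=lambda l: stats[l][1] / stats[l][0]
                            + c * math.sqrt(math.log(total) / stats[l][0]),
                            reverse=True)[:n_workers]
            # evaluate the picked leaves concurrently, then update stats
            for leaf, r in zip(picked, pool.map(reward, picked)):
                stats[leaf][0] += 1
                stats[leaf][1] += r
    return max(leaves, key=lambda l: stats[l][1] / stats[l][0])
```

Forcing the workers onto distinct leaves is the sketch's version of avoiding redundant exploration; the paper's analysis concerns when such batched exploration loses nothing asymptotically versus the sequential algorithm.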

### Interruptible Algorithms for Multiproblem Solving

In this paper we address the problem of designing an interruptible system in a setting in which $n$ problem instances, all equally important, must be solved concurrently. The system involves scheduling executions of contract algorithms (which offer a trade-off between allowable computation time and solution quality) on $m$ identical parallel processors. When an interruption occurs, the system must report a solution to each of the $n$ problem instances. The quality of this output is then compared to that of the best-possible algorithm that has foreknowledge of the interruption time and must, likewise, produce solutions to all $n$ problem instances. This extends the well-studied setting in which only one problem instance is queried at interruption time. In this work we first introduce new measures for evaluating the performance of interruptible systems in this setting. In particular, we propose the deficiency of a schedule as a performance measure that meets the requirements of the problem at hand. We then present a schedule whose performance we prove is within a small factor of optimal in the general, multiprocessor setting. We also show several lower bounds on the deficiency of schedules on a single processor. More precisely, we prove a general lower bound of $(n+1)/n$, an improved lower bound for the two-problem setting ($n=2$), and a tight lower bound for the class of round-robin schedules. Our techniques also yield a simpler, alternative proof of the main result of [Bernstein et al., IJCAI 2003] concerning the performance of cyclic schedules in multiprocessor environments.
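To make the objects concrete, here is a small Python sketch of a single-processor round-robin schedule with doubling contract lengths (an illustrative schedule, not the paper's optimal one) that reports, per problem instance, the longest contract completed by the interruption time:

```python
def longest_finished(n, interrupt_time):
    """Run contracts cyclically over n problems; the j-th contract overall
    serves problem j % n with length 2**(j // n). Return, per problem,
    the longest contract finished by the interruption."""
    best = [0.0] * n
    t, j = 0.0, 0
    while True:
        problem, length = j % n, 2.0 ** (j // n)
        if t + length > interrupt_time:  # this contract would be cut short
            return best
        t += length
        best[problem] = max(best[problem], length)
        j += 1
```

A measure like the paper's deficiency would then compare these per-problem finished lengths against what a schedule with foreknowledge of the interruption time could have provided for all $n$ instances.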

### Evading classifiers in discrete domains with provable optimality guarantees

Security-critical applications such as malware, fraud, or spam detection require machine learning models that operate on examples from constrained discrete domains. In these settings, gradient-based attacks that rely on adding perturbations often fail to produce adversarial examples that meet the domain constraints, and thus are not effective. We introduce a graphical framework that (1) formalizes existing attacks in discrete domains, (2) efficiently produces valid adversarial examples with guarantees of minimal cost, and (3) can accommodate complex cost functions beyond the commonly used $p$-norms. We demonstrate the effectiveness of this method by crafting adversarial examples that evade a Twitter bot detection classifier using a provably minimal number of changes.
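The graphical framing can be sketched as a shortest-path search over a graph of discrete transformations; the toy threshold "classifier" and unit-cost bit flips below are assumptions for illustration, not the paper's Twitter-bot model:

```python
import heapq

def classify(x):
    return sum(x) >= 3  # toy classifier: "positive" if >= 3 features set

def min_cost_adversarial(x0):
    """Dijkstra over bit-flip transformations: every edge flips one feature
    at unit cost, so the first decision-flipping state popped from the heap
    is a minimal-cost adversarial example."""
    start = tuple(x0)
    frontier, seen = [(0, start)], set()
    while frontier:
        cost, x = heapq.heappop(frontier)
        if classify(x) != classify(start):
            return cost, x            # minimality: heap pops in cost order
        if x in seen:
            continue
        seen.add(x)
        for i in range(len(x)):
            y = x[:i] + (1 - x[i],) + x[i + 1:]
            heapq.heappush(frontier, (cost + 1, y))
    return None
```

Replacing the unit edge cost with arbitrary per-transformation costs is what lets this style of search handle cost functions beyond $p$-norms while keeping the optimality guarantee.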

### Heuristic Search Planning With Multi-Objective Probabilistic LTL Constraints

We present an algorithm for computing cost-optimal stochastic policies for Stochastic Shortest Path problems (SSPs) subject to multi-objective PLTL constraints, i.e., conjunctions of probabilistic LTL formulas. Established algorithms capable of solving this problem typically stem from the area of probabilistic verification, and struggle with the large state spaces and constraint types found in automated planning. Our approach differs in two crucial ways. First, it operates entirely on-the-fly, bypassing the expensive construction of Rabin automata for the formulas and their prohibitive prior synchronisation with the full state space of the SSP. Second, it extends recent heuristic search algorithms and admissible heuristics for cost-constrained SSPs to enable pruning of regions made infeasible by the PLTL constraints. We prove our algorithm correct and optimal, and demonstrate encouraging scalability results.
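A cartoon of the on-the-fly idea, stripped of stochasticity and full PLTL (all structures and names are assumptions): a shortest-path search that checks a safety constraint as it expands states, pruning infeasible regions instead of building and synchronising a product up front:

```python
import heapq

def constrained_shortest_path(succ, labels, forbidden, start, goal):
    """Uniform-cost search that prunes, on the fly, any state whose label
    violates the (here, degenerate single-safety-bit) constraint monitor.
    succ: state -> list of (next_state, cost)."""
    frontier, seen = [(0, start)], set()
    while frontier:
        cost, s = heapq.heappop(frontier)
        if s == goal:
            return cost
        if s in seen or labels.get(s) in forbidden:  # prune infeasible region
            continue
        seen.add(s)
        for t, c in succ.get(s, []):
            heapq.heappush(frontier, (cost + c, t))
    return None
```

In the paper's setting the "monitor" is an automaton for each PLTL conjunct, tracked alongside the SSP state only for states the heuristic search actually reaches, and the pruning must reason about probability mass rather than a hard safety bit.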

### Combinatorial Optimization with Graph Convolutional Networks and Guided Tree Search

We present a learning-based approach to computing solutions for certain NP-hard problems. Our approach combines deep learning techniques with useful algorithmic elements from classic heuristics. The central component is a graph convolutional network that is trained to estimate the likelihood, for each vertex in a graph, of whether this vertex is part of the optimal solution. The network is designed and trained to synthesize a diverse set of solutions, which enables rapid exploration of the solution space via tree search. The presented approach is evaluated on four canonical NP-hard problems and five datasets, which include benchmark satisfiability problems and real social network graphs with up to a hundred thousand nodes. Experimental results demonstrate that the presented approach substantially outperforms recent deep learning work, and performs on par with highly optimized state-of-the-art heuristic solvers for some NP-hard problems. Experiments indicate that our approach generalizes across datasets, and scales to graphs that are orders of magnitude larger than those used during training.
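A stripped-down picture of "a probability map guides the search": below, hard-coded per-vertex scores stand in for the graph convolutional network's output, and a greedy pass (the degenerate, no-branching case of a guided tree search) extracts an independent set. The problem choice and all names are assumptions for illustration:

```python
def guided_mis(adj, probs):
    """Greedy independent set guided by per-vertex scores: repeatedly take
    the highest-scoring available vertex and exclude its neighbours.
    adj: vertex -> set of neighbours; probs: vertex -> score in [0, 1]."""
    order = sorted(probs, key=probs.get, reverse=True)
    chosen, banned = set(), set()
    for v in order:
        if v not in banned:
            chosen.add(v)
            banned.add(v)
            banned.update(adj[v])
    return chosen
```

The full method branches instead of committing, using several diverse probability maps to seed different branches of the tree search, so poor network predictions on some vertices can be recovered from.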

### Noisy Blackbox Optimization with Multi-Fidelity Queries: A Tree Search Approach

We study the problem of black-box optimization of a noisy function in the presence of low-cost approximations or fidelities, motivated by problems such as hyper-parameter tuning. In hyper-parameter tuning, evaluating the black-box function at a point involves training a learning algorithm on a large data-set with a particular hyper-parameter setting and measuring the validation error; even a single such evaluation can be prohibitively expensive. It is therefore beneficial to use low-cost approximations, such as training the learning algorithm on a sub-sampled version of the whole data-set. These low-cost approximations/fidelities can, however, provide a biased and noisy estimate of the function value. In this work, we incorporate the multi-fidelity setup into the powerful framework of noisy black-box optimization through tree-like hierarchical partitions. We propose a multi-fidelity, bandit-based tree-search algorithm for the problem and provide simple regret bounds for it. Finally, we validate the performance of our algorithm on real and synthetic datasets, where it outperforms several benchmarks.
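The core trade-off can be sketched in a few lines: screen partition cells with a cheap, biased, noisy low-fidelity estimate, then spend high-fidelity evaluations only on the surviving cells. The toy objective, the fidelity bias/noise model, and all names are assumptions; the paper's algorithm refines a hierarchical partition with bandit-style rules rather than this single two-pass split:

```python
import random

def f(x, fidelity):
    """Toy objective with optimum at x = 0.7; the low fidelity is
    cheaper but biased (+0.2) and much noisier."""
    noise = random.gauss(0, 0.5) if fidelity == "low" else random.gauss(0, 0.05)
    bias = 0.2 if fidelity == "low" else 0.0
    return -(x - 0.7) ** 2 + bias + noise

def multifidelity_search(n_cells=10, low_reps=20, high_reps=50):
    centers = [(i + 0.5) / n_cells for i in range(n_cells)]
    # cheap pass: average several low-fidelity queries per cell
    est = {c: sum(f(c, "low") for _ in range(low_reps)) / low_reps
           for c in centers}
    top = sorted(centers, key=est.get, reverse=True)[:3]
    # expensive pass: high fidelity only on the three surviving cells
    est_hi = {c: sum(f(c, "high") for _ in range(high_reps)) / high_reps
              for c in top}
    return max(est_hi, key=est_hi.get)
```

Note that because the low fidelity is uniformly biased here, it still ranks cells usefully; handling fidelities whose bias varies across the domain is part of what the paper's analysis addresses.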