# Search

### Russian Security Service Searches Space Agency Over Suspected Treason: TASS

MOSCOW (Reuters) - Russia's Federal Security Service has searched a research facility controlled by the country's space agency Roskosmos over the suspected leaking of secrets about new hypersonic weapons to Western spies, the TASS news agency reported on Friday.

### Exponential Weights on the Hypercube in Polynomial Time

We study a general online linear optimization (OLO) problem. At each round, a subset of objects from a fixed universe of $n$ objects is chosen, and a linear cost associated with the chosen subset is incurred. We use \textit{regret} as a measure of performance of our algorithms. Regret is the difference between the total cost incurred over all iterations and the cost of the best fixed subset in hindsight. We consider \textit{Full Information}, \textit{Semi-Bandit} and \textit{Bandit} feedback for this problem. Using characteristic vectors of the subsets, this problem reduces to OLO on the $\{0,1\}^n$ hypercube. The Exp2 algorithm and its bandit variants are commonly used strategies for this problem. It was previously unknown whether Exp2 can be run on the hypercube in polynomial time. In this paper, we present a polynomial time algorithm called \textit{PolyExp} for OLO on the hypercube. We show that our algorithm is equivalent both to Exp2 on $\{0,1\}^n$ and to Online Mirror Descent (OMD) with Entropic regularization on $[0,1]^n$ and Bernoulli Sampling. Under $L_\infty$ adversarial losses, in the Full Information and Semi-Bandit cases, analyzing Exp2 directly gives an expected regret bound of $O(n^{3/2}\sqrt{T})$, whereas PolyExp yields a regret of $O(n\sqrt{T})$. In the Bandit case, analyzing Exp2 directly gives an expected regret bound of $O(n^{2}\sqrt{T})$, whereas PolyExp yields a regret of $O(n^{3/2}\sqrt{T})$. Because of the equivalence, this implies an improvement on Exp2's regret bound for these settings. Moreover, PolyExp is minimax optimal in all three settings, as its regret bounds match the $L_\infty$ lower bounds in \cite{audibert2011minimax}. Finally, we show how to use PolyExp on the $\{-1,+1\}^n$ hypercube, solving an open problem in \cite{bubeck2012towards}.

### A Game-Based Approximate Verification of Deep Neural Networks with Provable Guarantees

Despite the improved accuracy of deep neural networks, the discovery of adversarial examples has raised serious safety concerns. In this paper, we study two variants of pointwise robustness, the maximum safe radius problem, which for a given input sample computes the minimum distance to an adversarial example, and the feature robustness problem, which aims to quantify the robustness of individual features to adversarial perturbations. We demonstrate that, under the assumption of Lipschitz continuity, both problems can be approximated using finite optimisation by discretising the input space, and the approximation has provable guarantees, i.e., the error is bounded. We then show that the resulting optimisation problems can be reduced to the solution of two-player turn-based games, where the first player selects features and the second perturbs the image within the feature. While the second player aims to minimise the distance to an adversarial example, depending on the optimisation objective the first player can be cooperative or competitive. We employ an anytime approach to solve the games, in the sense of approximating the value of a game by monotonically improving its upper and lower bounds. The Monte Carlo tree search algorithm is applied to compute upper bounds for both games, and the Admissible A* and the Alpha-Beta Pruning algorithms are, respectively, used to compute lower bounds for the maximum safety radius and feature robustness games. When working on the upper bound of the maximum safe radius problem, our tool demonstrates competitive performance against existing adversarial example crafting algorithms. Furthermore, we show how our framework can be deployed to evaluate pointwise robustness of neural networks in safety-critical applications such as traffic sign recognition in self-driving cars.

### Bootstrap Learning of Heuristic Functions

We investigate the use of machine learning to create effective heuristics for search algorithms such as IDA* or heuristic search planners.

### Automated Machine Learning Hyperparameter Tuning in Python

Tuning machine learning hyperparameters is a tedious yet crucial task, as the performance of an algorithm can be highly dependent on the choice of hyperparameters. Manual tuning takes time away from important steps of the machine learning pipeline like feature engineering and interpreting results. Grid and random search are hands-off, but require long run times because they waste time evaluating unpromising areas of the search space. Increasingly, hyperparameter tuning is done by automated methods that aim to find optimal hyperparameters in less time using an informed search with no manual effort necessary beyond the initial set-up. Bayesian optimization, a model-based method for finding the minimum of a function, has recently been applied to machine learning hyperparameter tuning, with results suggesting this approach can achieve better performance on the test set while requiring fewer iterations than random search.
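For reference, the random-search baseline mentioned above can be sketched in a few lines of plain Python. This is only the uninformed baseline, not Bayesian optimization: every trial is drawn independently of earlier results, which is why it can spend time in unpromising regions. The objective `toy_objective`, the hyperparameter names and their ranges are hypothetical stand-ins for a real validation loss.

```python
import random


def random_search(objective, space, n_iter, seed=0):
    """Minimize `objective` by sampling each hyperparameter uniformly.

    space: dict mapping a hyperparameter name to a (low, high) range.
    Returns (best_params, best_score).
    """
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(n_iter):
        # Each trial ignores all previous results (uninformed search).
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}
        score = objective(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_objective(params):
    """Hypothetical validation loss with its minimum at (0.1, 0.8)."""
    return (params["learning_rate"] - 0.1) ** 2 + (params["subsample"] - 0.8) ** 2


if __name__ == "__main__":
    space = {"learning_rate": (0.0, 1.0), "subsample": (0.5, 1.0)}
    print(random_search(toy_objective, space, n_iter=200))
```

An informed method would replace the uniform draw with a proposal based on a model fitted to the scores observed so far.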

### An AI System Taught Itself How to Solve the Rubik's Cube in Just 44 Hours

A self-taught artificial intelligence (AI) system called DeepCube has mastered solving the Rubik's Cube puzzle in just 44 hours without human intervention. The system's inventors have detailed their design in a paper titled 'Solving the Rubik's Cube Without Human Knowledge'. "A generally intelligent agent must be able to teach itself how to solve problems in complex domains with minimal human supervision," write the paper's authors. "Indeed, if we're ever going to achieve a general, human-like machine intelligence, we'll have to develop systems that can learn and then apply those learnings to real-world applications." While many AI systems have been taught to play games, mastering the complexity of a Rubik's Cube posed a unique set of challenges.

### AI in Game Playing: Sokoban Solver

Artificial Intelligence is becoming instrumental in a variety of applications. Games serve as a good breeding ground for trying and testing these algorithms in a sandbox with simpler constraints in comparison to real life. In this project, we aim to develop an AI agent that can solve the classical Japanese game of Sokoban using various algorithms and heuristics and compare their performances through standard metrics.

### The Art of Drafting: A Team-Oriented Hero Recommendation System for Multiplayer Online Battle Arena Games

Multiplayer Online Battle Arena (MOBA) games have recently become increasingly popular. In a match of such games, players compete in two teams of five, each controlling an in-game avatar, known as a hero, selected from a roster of more than 100. The selection of heroes, also known as pick or draft, takes place before the match starts and alternates between the two teams until each player has selected one hero. Heroes are designed with different strengths and weaknesses to promote team cooperation in a game. Intuitively, heroes in a strong team should complement each other's strengths and suppress those of opponents. Hero drafting is therefore a challenging problem due to the complex hero-to-hero relationships to consider. In this paper, we propose a novel hero recommendation system that suggests heroes to add to an existing team while maximizing the team's prospect for victory. To that end, we model the drafting between two teams as a combinatorial game and use Monte Carlo Tree Search (MCTS) for estimating the values of hero combinations. Our empirical evaluation shows that hero teams drafted by our recommendation algorithm have a significantly higher win rate against teams constructed by other baseline and state-of-the-art strategies.

### Machine taught itself to solve Rubik's Cube without human help, UC Irvine researchers say

Two algorithms, collectively called Deep Cube, typically can solve the 3-D combination puzzle within 30 moves, which is as few as or fewer than the number needed by systems that use human knowledge, according to the research paper. Fewer than 5.8% of the world's population can solve the Rubik's Cube, according to the Rubik's website.

### An Improved Generic Bet-and-Run Strategy for Speeding Up Stochastic Local Search

A commonly used strategy for improving optimization algorithms is to restart the algorithm when it is believed to be trapped in an inferior part of the search space. Building on the recent success of Bet-and-Run approaches for restarted local search solvers, we introduce an improved generic Bet-and-Run strategy. The goal is to obtain the best possible results within a given time budget t using a given black-box optimization algorithm. If no prior knowledge about problem features and algorithm behavior is available, the question about how to use the time budget most efficiently arises. We propose to first start k>=1 independent runs of the algorithm during an initialization budget t1<t, pausing these runs, then apply a decision maker D to choose 1<=m<=k runs from them (consuming t2>=0 time units in doing so), and then continuing these runs for the remaining t3=t-t1-t2 time units. In previous Bet-and-Run strategies, the decision maker D=currentBest would simply select the run with the best-so-far results at negligible time. We propose using more advanced methods to discriminate between "good" and "bad" sample runs, with the goal of increasing the correlation of the chosen run with the a-posteriori best one. We test several different approaches, including neural networks trained or polynomials fitted on the current trace of the algorithm to predict which run may yield the best results if granted the remaining budget. We show with extensive experiments that this approach can yield better results than the previous methods, but also find that the currentBest method is a very reliable and robust baseline approach.
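The budget split can be made concrete with a small sketch of the currentBest baseline. It is an illustration under stated assumptions, not the authors' implementation. Time is counted in optimizer steps, the decision is assumed to cost nothing (t2 = 0), and `m` is the number of runs the sketch keeps. The `HillClimber` class and its step size are a hypothetical stand-in for the black-box algorithm.

```python
import random


class HillClimber:
    """Hypothetical resumable black-box minimizer of a 1-D function."""

    def __init__(self, f, rng):
        self.f = f
        self.rng = rng
        self.x = rng.uniform(-10.0, 10.0)
        self.best = f(self.x)

    def step(self):
        # Propose a Gaussian perturbation and keep it if it is not worse.
        candidate = self.x + self.rng.gauss(0.0, 0.5)
        value = self.f(candidate)
        if value <= self.best:
            self.x, self.best = candidate, value


def bet_and_run(f, k, t1, t, m=1, seed=0):
    """Bet-and-Run with the currentBest decision maker (illustrative sketch)."""
    if not (1 <= m <= k and 0 < t1 < t):
        raise ValueError("need 1 <= m <= k and 0 < t1 < t")
    master = random.Random(seed)
    runs = [HillClimber(f, random.Random(master.getrandbits(32))) for _ in range(k)]
    # Phase 1: share the initialization budget t1 equally among the k runs.
    steps_each = t1 // k
    for run in runs:
        for _ in range(steps_each):
            run.step()
    # Phase 2: currentBest keeps the m runs with the best results so far.
    chosen = sorted(runs, key=lambda r: r.best)[:m]
    # Phase 3: spend the remaining budget t3 on the chosen runs only.
    t3 = t - steps_each * k
    for _ in range(t3 // m):
        for run in chosen:
            run.step()
    return min(run.best for run in chosen)


if __name__ == "__main__":
    print(bet_and_run(lambda x: (x - 3.0) ** 2, k=4, t1=40, t=400))
```

A more advanced decision maker would replace the `sorted(...)` line with a prediction of each run's final result from its trace so far.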