Search
Greed is Good: Exploration and Exploitation Trade-offs in Bayesian Optimisation
De Ath, George, Everson, Richard M., Rahat, Alma A. M., Fieldsend, Jonathan E.
The performance of acquisition functions for Bayesian optimisation is investigated in terms of the Pareto front between exploration and exploitation. We show that Expected Improvement and the Upper Confidence Bound always select solutions to be expensively evaluated on the Pareto front, but Probability of Improvement is never guaranteed to do so and Weighted Expected Improvement does only for a restricted range of weights. We introduce two novel $\epsilon$-greedy acquisition functions. Extensive empirical evaluation of these together with random search, purely exploratory and purely exploitative search on 10 benchmark problems in 1 to 10 dimensions shows that $\epsilon$-greedy algorithms are generally at least as effective as conventional acquisition functions, particularly with a limited budget. In higher dimensions $\epsilon$-greedy approaches are shown to have improved performance over conventional approaches. These results are borne out on a real world computational fluid dynamics optimisation problem and a robotics active learning problem.
Augmented Random Search for Quadcopter Control: An alternative to Reinforcement Learning
Tiwari, Ashutosh Kumar, Nadimpalli, Sandeep Varma
Model-based reinforcement learning strategies are believed to exhibit more significant sample complexity than model-free strategies to control dynamical systems,such as quadcopters.This belief that Model-based strategies that involve the use of well-trained neural networks for making such high-level decisions always give better performance can be dispelled by making use of Model-free policy search methods.This paper proposes the use of a model-free random searching strategy,called Augmented Random Search(ARS),which is a better and faster approach of linear policy training for continuous control tasks like controlling a Quadcopters flight.The method achieves state-of-the-art accuracy by eliminating the use of too much data for the training of neural networks that are present in the previous approaches to the task of Quadcopter control.The paper also highlights the performance results of the searching strategy used for this task in a strategically designed task environment with the help of simulations.Reward collection performance over 1000 episodes and agents behavior in flight for augmented random search is compared with that of the behavior for reinforcement learning state-of-the-art algorithm,called Deep Deterministic policy gradient(DDPG).Our simulations and results manifest that a high variability in performance is observed in commonly used strategies for sample efficiency of such tasks but the built policy network of ARS-Quad can react relatively accurately to step response providing a better performing alternative to reinforcement learning strategies.
Learning Neural Search Policies for Classical Planning
Gomoluch, Pawel, Alrajeh, Dalal, Russo, Alessandra, Bucchiarone, Antonio
Heuristic forward search is currently the dominant paradigm in classical planning. Forward search algorithms typically rely on a single, relatively simple variation of best-first search and remain fixed throughout the process of solving a planning problem. Existing work combining multiple search techniques usually aims at supporting best-first search with an additional exploratory mechanism, triggered using a handcrafted criterion. A notable exception is very recent work which combines various search techniques using a trainable policy. It is, however, confined to a discrete action space comprising several fixed subroutines. In this paper, we introduce a parametrized search algorithm template which combines various search techniques within a single routine. The template's parameter space defines an infinite space of search algorithms, including, among others, BFS, local and random search. We further introduce a neural architecture for designating the values of the search parameters given the state of the search. This enables expressing neural search policies that change the values of the parameters as the search progresses. The policies can be learned automatically, with the objective of maximizing the planner's performance on a given distribution of planning problems. We consider a training setting based on a stochastic optimization algorithm known as the cross-entropy method (CEM). Experimental evaluation of our approach shows that it is capable of finding effective distribution-specific search policies, outperforming the relevant baselines.
Fair DARTS: Eliminating Unfair Advantages in Differentiable Architecture Search
Chu, Xiangxiang, Zhou, Tianbao, Zhang, Bo, Li, Jixiang
Differential Architecture Search (DARTS) is now a widely disseminated weight-sharing neural architecture search method. However, there are two fundamental weaknesses remain untackled. First, we observe that the well-known aggregation of skip connections during optimization is caused by an unfair advantage in an exclusive competition. Second, there is a non-negligible incongruence when discretizing continuous architectural weights to a one-hot representation. Because of these two reasons, DARTS delivers a biased solution that might not even be suboptimal. In this paper, we present a novel approach to curing both frailties. Specifically, as unfair advantages in a pure exclusive competition easily induce a monopoly, we relax the choice of operations to be collaborative, where we let each operation have an equal opportunity to develop its strength. We thus call our method Fair DARTS. Moreover, we propose a zero-one loss to directly reduce the discretization gap. Experiments are performed on two mainstream search spaces, in which we achieve new state-of-the-art networks on ImageNet. Our code is available on https://github.com/xiaomi-automl/fairdarts.
Trading Convergence Rate with Computational Budget in High Dimensional Bayesian Optimization
Tran-The, Hung, Gupta, Sunil, Rana, Santu, Venkatesh, Svetha
Scaling Bayesian optimisation (BO) to high-dimensional search spaces is a active and open research problems particularly when no assumptions are made on function structure. The main reason is that at each iteration, BO requires to find global maximisation of acquisition function, which itself is a non-convex optimization problem in the original search space. With growing dimensions, the computational budget for this maximisation gets increasingly short leading to inaccurate solution of the maximisation. This inaccuracy adversely affects both the convergence and the efficiency of BO. We propose a novel approach where the acquisition function only requires maximisation on a discrete set of low dimensional subspaces embedded in the original high-dimensional search space. Our method is free of any low dimensional structure assumption on the function unlike many recent high-dimensional BO methods. Optimising acquisition function in low dimensional subspaces allows our method to obtain accurate solutions within limited computational budget. We show that in spite of this convenience, our algorithm remains convergent. In particular, cumulative regret of our algorithm only grows sub-linearly with the number of iterations. More importantly, as evident from our regret bounds, our algorithm provides a way to trade the convergence rate with the number of subspaces used in the optimisation. Finally, when the number of subspaces is "sufficiently large", our algorithm's cumulative regret is at most $\mathcal{O}^{*}(\sqrt{T\gamma_T})$ as opposed to $\mathcal{O}^{*}(\sqrt{DT\gamma_T})$ for the GP-UCB of Srinivas et al. (2012), reducing a crucial factor $\sqrt{D}$ where $D$ being the dimensional number of input space.
Minimax Optimal Algorithms for Adversarial Bandit Problem with Multiple Plays
Vural, N. Mert, Gokcesu, Hakan, Gokcesu, Kaan, Kozat, Suleyman S.
We investigate the adversarial bandit problem with multiple plays under semi-bandit feedback. We introduce a highly efficient algorithm that asymptotically achieves the performance of the best switching $m$-arm strategy with minimax optimal regret bounds. To construct our algorithm, we introduce a new expert advice algorithm for the multiple-play setting. By using our expert advice algorithm, we additionally improve the best-known high-probability bound for the multi-play setting by $O(\sqrt{m})$. Our results are guaranteed to hold in an individual sequence manner since we have no statistical assumption on the bandit arm gains. Through an extensive set of experiments involving synthetic and real data, we demonstrate significant performance gains achieved by the proposed algorithm with respect to the state-of-the-art algorithms.
On the Measure of Intelligence
To make deliberate progress towards more intelligent and more human-like artificial systems, we need to be following an appropriate feedback signal: we need to be able to define and evaluate intelligence in a way that enables comparisons between two systems, as well as comparisons with humans. Over the past hundred years, there has been an abundance of attempts to define and measure intelligence, across both the fields of psychology and AI. We summarize and critically assess these definitions and evaluation approaches, while making apparent the two historical conceptions of intelligence that have implicitly guided them. We note that in practice, the contemporary AI community still gravitates towards benchmarking intelligence by comparing the skill exhibited by AIs and humans at specific tasks such as board games and video games. We argue that solely measuring skill at any given task falls short of measuring intelligence, because skill is heavily modulated by prior knowledge and experience: unlimited priors or unlimited training data allow experimenters to "buy" arbitrary levels of skills for a system, in a way that masks the system's own generalization power. We then articulate a new formal definition of intelligence based on Algorithmic Information Theory, describing intelligence as skill-acquisition efficiency and highlighting the concepts of scope, generalization difficulty, priors, and experience. Using this definition, we propose a set of guidelines for what a general AI benchmark should look like. Finally, we present a benchmark closely following these guidelines, the Abstraction and Reasoning Corpus (ARC), built upon an explicit set of priors designed to be as close as possible to innate human priors. We argue that ARC can be used to measure a human-like form of general fluid intelligence and that it enables fair general intelligence comparisons between AI systems and humans.
Smart Predict-and-Optimize for Hard Combinatorial Optimization Problems
Mandi, Jaynta, Demiroviฤ, Emir, Stuckey, Peter. J, Guns, Tias
Combinatorial optimization assumes that all parameters of the optimization problem, e.g. the weights in the objective function, are fixed. Often, these weights are mere estimates and increasingly machine learning techniques are used to for their estimation. Recently, Smart Predict and Optimize (SPO) has been proposed for problems with a linear objective function over the predictions, more specifically linear programming problems. It takes the regret of the predictions on the linear problem into account, by repeatedly solving it during learning. We investigate the use of SPO to solve more realistic discrete optimization problems. The main challenge is the repeated solving of the optimization problem. To this end, we investigate ways to relax the problem as well as warm-starting the learning and the solving. Our results show that even for discrete problems it often suffices to train by solving the relaxation in the SPO loss. Furthermore, this approach outperforms the state-of-the-art approach of Wilder, Dilkina, and Tambe. We experiment with weighted knapsack problems as well as complex scheduling problems, and show for the first time that a predict-and-optimize approach can successfully be used on large-scale combinatorial optimization problems. Introduction Combinatorial optimization aims to optimize an objective function over a set of feasible solutions defined on a discrete space. Numerous real-life decision-making problems can be formulated as combinatorial optimization problems (Korte et al. 2012; Trevisan 2011).
Multi-objective Neural Architecture Search via Predictive Network Performance Optimization
Shi, Han, Pi, Renjie, Xu, Hang, Li, Zhenguo, Kwok, James T., Zhang, Tong
Neural Architecture Search (NAS) has shown great potentials in finding a better neural network design than human design. Sample-based NAS is the most fundamental method aiming at exploring the search space and evaluating the most promising architecture. However, few works have focused on improving the sampling efficiency for a multi-objective NAS. Inspired by the nature of the graph structure of a neural network, we propose BOGCN-NAS, a NAS algorithm using Bayesian Optimization with Graph Convolutional Network (GCN) predictor. Specifically, we apply GCN as a surrogate model to adaptively discover and incorporate nodes structure to approximate the performance of the architecture. Our method further considers an efficient multi-objective search which can be flexibly injected into any sample-based NAS pipelines to efficiently find the best speed/accuracy tradeoff. Extensive experiments are conducted to verify the effectiveness of our method over many competing methods, e.g. Recently Neural Architecture Search (NAS) has aroused a surge of interest by its potentials of freeing the researchers from tedious and time-consuming architecture tuning for each new task and dataset. Specifically, NAS has already shown some competitive results comparing with handcrafted architectures in computer vision: classification (Real et al., 2019b), detection, segmentation (Ghiasi et al., 2019; Chen et al., 2019; Liu et al., 2019a) and super-resolution (Chu et al., 2019). Meanwhile, NAS has also achieved remarkable results in natural language processing tasks (Luong et al., 2018; So et al., 2019). A variety of search strategies have been proposed, which may be categorized into two groups: one-shot NAS algorithms (Liu et al., 2019b; Pham et al., 2018; Luo et al., 2018), and sample-based algorithms (Zoph & Le, 2017; Liu et al., 2018a; Real et al., 2019b).
Bayesian optimization with local search
Gao, Yuzhou, Yu, Tengchao, Li, Jinglai
Global optimization finds applications in a wide range of real world problems. The multi-start methods are a popular class of global optimization techniques, which are based on the ideas of conducting local searches at multiple starting points, and then sequentially determine the starting points according to some prescribed rules. In this work we propose a new multi-start algorithm where the starting points are determined in a Bayesian optimization framework. Specifically, the method can be understood as to construct a new function by conducting local searches of the original objective function, where the new function attains the same global optima as the original one. Bayesian optimization is then applied to find the global optima of the new local search based function.