 Optimization


A Multi-criteria Approach for Fast and Outlier-aware Representative Selection from Manifolds

arXiv.org Machine Learning

The problem of representative selection amounts to sampling few informative exemplars from large datasets. This paper presents MOSAIC, a novel representative selection approach from high-dimensional data that may exhibit non-linear structures. Resting upon a novel quadratic formulation, our method advances a multi-criteria selection approach that maximizes the global representation power of the sampled subset, ensures diversity, and rejects disruptive information by effectively detecting outliers. Through theoretical analyses we characterize the obtained sketch and reveal that the sampled representatives maximize a well-defined notion of data coverage in a transformed space. In addition, we present a highly scalable randomized implementation of the proposed algorithm shown to bring about substantial speedups. MOSAIC's superiority in achieving the desired characteristics of a representative subset all at once while exhibiting remarkable robustness to various outlier types is demonstrated via extensive experiments conducted on both real and synthetic data with comparisons to state-of-the-art algorithms.


Machine Learning on Volatile Instances

arXiv.org Machine Learning

Due to the massive size of the neural network models and training datasets used in machine learning today, it is imperative to distribute stochastic gradient descent (SGD) by splitting up tasks such as gradient evaluation across multiple worker nodes. However, running distributed SGD can be prohibitively expensive because it may require specialized computing resources such as GPUs for extended periods of time. We propose cost-effective strategies to exploit volatile cloud instances that are cheaper than standard instances, but may be interrupted by higher priority workloads. To the best of our knowledge, this work is the first to quantify how variations in the number of active worker nodes (as a result of preemption) affect SGD convergence and the time to train the model. By understanding these trade-offs between preemption probability of the instances, accuracy, and training time, we are able to derive practical strategies for configuring distributed SGD jobs on volatile instances such as Amazon EC2 spot instances and other preemptible cloud instances. Experimental results show that our strategies achieve good training performance at substantially lower cost.
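
To make the setting concrete, the following is a minimal sketch of synchronous SGD in which the number of active workers varies from step to step. It is not the paper's strategy: the preemption model (each worker dropped independently with probability `p_preempt`), the one-parameter regression task, and all names and default values are assumptions chosen for illustration.

```python
import random

def sgd_with_preemption(data, n_workers=4, p_preempt=0.3, lr=0.05,
                        steps=500, seed=0):
    """Fit y = w * x by synchronous SGD on (x, y) pairs.

    Assumed model: at every step each worker is preempted independently
    with probability p_preempt, so the number of gradients varies.
    """
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        # Count the workers that survive preemption in this iteration.
        active = sum(rng.random() >= p_preempt for _ in range(n_workers))
        if active == 0:
            continue  # all workers preempted: this step makes no progress
        # Each active worker computes a squared-error gradient on one sample.
        grads = []
        for _ in range(active):
            x, y = rng.choice(data)
            grads.append(2.0 * (w * x - y) * x)
        # Average over the workers that actually reported a gradient.
        w -= lr * sum(grads) / active
    return w

if __name__ == "__main__":
    samples = [(x, 3.0 * x) for x in (0.5, 1.0, 1.5, 2.0)]
    print(sgd_with_preemption(samples))
```

Fewer active workers means each update averages fewer gradients, which is the kind of effect on convergence and training time that the abstract describes quantifying.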


Improved Binary Artificial Bee Colony Algorithm

arXiv.org Artificial Intelligence

The Artificial Bee Colony (ABC) algorithm is an evolutionary optimization algorithm based on swarm intelligence and inspired by the honey bees' food search behavior. Since the ABC algorithm has been developed to achieve optimal solutions by searching in the continuous search space, modification is required to apply this method to binary optimization problems. In this paper, we improve the ABC algorithm to solve binary optimization problems and call it the improved binary Artificial Bee Colony (ibinABC). The proposed method consists of an update mechanism based on fitness values and processing a different number of decision variables. Thus, we aim to prevent the ABC algorithm from getting stuck in a local minimum by increasing its exploration ability. We compare the ibinABC algorithm with three variants of the ABC and other meta-heuristic algorithms in the literature. For comparison, we use the well-known OR-Library dataset containing 15 problem instances prepared for the uncapacitated facility location problem. Computational results show that the proposed method is superior to other methods in terms of convergence speed and robustness. The source code of the algorithm will be available on GitHub after the review process.
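
As a rough illustration of the benchmark problem, the sketch below evaluates a binary solution of the uncapacitated facility location problem and builds a neighbor by flipping a chosen set of decision variables. It does not reproduce the ibinABC update mechanism; the function names, the data layout (`service_costs[customer][facility]`), and the toy numbers are assumptions.

```python
def uflp_cost(open_mask, fixed_costs, service_costs):
    """Cost of a binary solution: opening costs of the selected facilities
    plus, for each customer, the cheapest service cost among open ones."""
    open_ids = [j for j, bit in enumerate(open_mask) if bit]
    if not open_ids:
        return float("inf")  # no open facility: customers cannot be served
    opening = sum(fixed_costs[j] for j in open_ids)
    service = sum(min(row[j] for j in open_ids) for row in service_costs)
    return opening + service

def flip_bits(open_mask, positions):
    """Neighbor solution obtained by flipping the given decision variables."""
    flipped = list(open_mask)
    for j in positions:
        flipped[j] = 1 - flipped[j]
    return flipped

if __name__ == "__main__":
    fixed = [10, 4, 7]                           # hypothetical opening costs
    service = [[1, 5, 9], [6, 2, 8], [7, 3, 1]]  # rows are customers
    current = [1, 0, 0]
    neighbor = flip_bits(current, [0, 1])
    print(uflp_cost(current, fixed, service), uflp_cost(neighbor, fixed, service))
```

Varying how many positions are flipped per move is one simple way to trade exploration against exploitation in a binary search space.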


A modified single and multi-objective bacteria foraging optimization for the solution of quadratic assignment problem

arXiv.org Artificial Intelligence

NP-hard (non-deterministic polynomial-time hard) problems are challenging because no polynomial-time algorithm has yet been discovered to solve them. The Bacteria Foraging Optimization (BFO) algorithm is one of the metaheuristic algorithms that is mostly used for NP-hard problems. BFO is inspired by the foraging behavior of bacteria such as Escherichia coli (E. coli). The aim of BFO is to eliminate those bacteria that have weak foraging properties and maintain those bacteria that have breakthrough foraging properties toward the optimum. Despite the strength of this algorithm, reaching optimal solutions for most problems is time-demanding or impossible. In this paper, we modified single-objective BFO by adding a mutation operator, and multi-objective BFO (MOBFO) by adding mutation and crossover operators from genetic algorithms to update the solutions in each generation, together with a local tabu search algorithm to reach the local optimum solution. Additionally, we used a fast nondominated sort algorithm in MOBFO to find the best nondominated solutions in each generation. We evaluated the performance of the proposed algorithms through a number of single and multi-objective Quadratic Assignment Problem (QAP) instances. The experimental results show that our approaches outperform some previous optimization algorithms in both convergent and divergent solutions.
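
For readers unfamiliar with the QAP, the sketch below shows the standard single-objective cost of an assignment and a swap mutation that keeps the solution a valid permutation. It is an illustration only, not the modified BFO of the paper; the function names and the 3x3 flow and distance matrices are hypothetical.

```python
def qap_cost(perm, flow, dist):
    """QAP objective: facility i is placed at location perm[i]; the cost
    sums flow between facilities times distance between their locations."""
    n = len(perm)
    return sum(flow[i][j] * dist[perm[i]][perm[j]]
               for i in range(n) for j in range(n))

def swap_mutation(perm, i, j):
    """Exchange the locations of facilities i and j; the result is
    still a permutation, so feasibility is preserved."""
    child = list(perm)
    child[i], child[j] = child[j], child[i]
    return child

if __name__ == "__main__":
    flow = [[0, 3, 1], [3, 0, 2], [1, 2, 0]]  # hypothetical flows
    dist = [[0, 5, 2], [5, 0, 4], [2, 4, 0]]  # hypothetical distances
    parent = [0, 1, 2]
    child = swap_mutation(parent, 0, 1)
    print(qap_cost(parent, flow, dist), qap_cost(child, flow, dist))
```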


Model-Free Algorithm and Regret Analysis for MDPs with Peak Constraints

arXiv.org Machine Learning

In the optimization of dynamic systems, the variables typically have constraints. Such problems can be modeled as a constrained Markov Decision Process (MDP). This paper considers a model-free approach to the problem, where the transition probabilities are not known. In the presence of peak constraints, the agent has to choose the policy to maximize the long-term average reward as well as satisfy the constraints at each time. We propose modifications to the standard Q-learning problem for unconstrained optimization to come up with an algorithm with peak constraints. The proposed algorithm is shown to achieve $O(T^{1/2+\gamma})$ regret bound for the obtained reward, and $O(T^{1-\gamma})$ regret bound for the constraint violation for any $\gamma \in(0,1/2)$ and time-horizon $T$. We note that these are the first results on regret analysis for constrained MDP, where the transition probabilities are not known a priori. We demonstrate the proposed algorithm on an energy harvesting problem where it outperforms the state of the art and performs close to the theoretical upper bound of the studied optimization problem.
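
The two bounds trade off against each other through $\gamma$. A short worked calculation, using only the exponents stated above, shows where they balance and why both are sublinear over the whole admissible range:

$$
\frac{1}{2} + \gamma = 1 - \gamma \;\Longrightarrow\; \gamma = \frac{1}{4}, \qquad O\!\left(T^{1/2+\gamma}\right) = O\!\left(T^{1-\gamma}\right) = O\!\left(T^{3/4}\right).
$$

$$
\gamma \in \left(0, \tfrac{1}{2}\right) \;\Longrightarrow\; \frac{T^{1/2+\gamma}}{T} = T^{\gamma - 1/2} \to 0 \quad\text{and}\quad \frac{T^{1-\gamma}}{T} = T^{-\gamma} \to 0 \quad (T \to \infty).
$$

A smaller $\gamma$ tightens the reward bound and loosens the violation bound, and a larger $\gamma$ does the opposite.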


Stochastic Coordinate Minimization with Progressive Precision for Stochastic Convex Optimization

arXiv.org Machine Learning

A framework based on iterative coordinate minimization (CM) is developed for stochastic convex optimization. Given that exact coordinate minimization is impossible due to the unknown stochastic nature of the objective function, the crux of the proposed optimization algorithm is an optimal control of the minimization precision in each iteration. We establish the optimal precision control and the resulting order-optimal regret performance for strongly convex and separably nonsmooth functions. An interesting finding is that the optimal progression of precision across iterations is independent of the low-dimensional CM routine employed, suggesting a general framework for extending low-dimensional optimization routines to high-dimensional problems. The proposed algorithm is amenable to online implementation and inherits the scalability and parallelizability properties of CM for large-scale optimization. Requiring only a sublinear order of message exchanges, it also lends itself well to distributed computing as compared with the alternative approach of coordinate gradient descent.


Time-varying Gaussian Process Bandit Optimization with Non-constant Evaluation Time

arXiv.org Machine Learning

The Gaussian process bandit is a problem in which we want to find a maximizer of a black-box function with the minimum number of function evaluations. If the black-box function varies with time, then time-varying Bayesian optimization is a promising framework. However, a drawback with current methods is in the assumption that the evaluation time for every observation is constant, which can be unrealistic for many practical applications, e.g., recommender systems and environmental monitoring. As a result, the performance of current methods can be degraded when this assumption is violated. To cope with this problem, we propose a novel time-varying Bayesian optimization algorithm that can effectively handle the non-constant evaluation time. Furthermore, we theoretically establish a regret bound of our algorithm. Our bound elucidates that a pattern of the evaluation time sequence can hugely affect the difficulty of the problem. We also provide experimental results to validate the practical effectiveness of the proposed method.


ENTMOOT: A Framework for Optimization over Ensemble Tree Models

arXiv.org Artificial Intelligence

Gradient boosted trees and other regression tree models perform well in a wide range of real-world, industrial applications. These tree models (i) offer insight into important prediction features, (ii) effectively manage sparse data, and (iii) have excellent prediction capabilities. Despite their advantages, they are generally unpopular for decision-making tasks and black-box optimization, which is due to their difficult-to-optimize structure and the lack of a reliable uncertainty measure. ENTMOOT is our new framework for integrating (already trained) tree models into larger optimization problems. The contributions of ENTMOOT include: (i) explicitly introducing a reliable uncertainty measure that is compatible with tree models, (ii) solving the larger optimization problems that incorporate these uncertainty-aware tree models, (iii) proving that the solutions are globally optimal, i.e. no better solution exists. In particular, we show how the ENTMOOT approach allows a simple integration of tree models into decision-making and black-box optimization, where it proves to be a strong competitor to commonly used frameworks.


Composition of kernel and acquisition functions for High Dimensional Bayesian Optimization

arXiv.org Machine Learning

Bayesian Optimization has become the reference method for the global optimization of black box, expensive and possibly noisy functions. Bayesian Optimization learns a probabilistic model about the objective function, usually a Gaussian Process, and builds, depending on its mean and variance, an acquisition function whose optimizer yields the new evaluation point, leading to update the probabilistic surrogate model. Despite its sample efficiency, Bayesian Optimization does not scale well with the dimensions of the problem. The optimization of the acquisition function has received less attention because its computational cost is usually considered negligible compared to that of the evaluation of the objective function. Its efficient optimization is often inhibited, particularly in high dimensional problems, by multiple extrema. In this paper we leverage the additionality of the objective function into mapping both the kernel and the acquisition function of the Bayesian Optimization in lower dimensional subspaces. This approach makes more efficient the learning/updating of the probabilistic surrogate model and allows an efficient optimization of the acquisition function. Experimental results are presented for real-life application, that is the control of pumps in urban water distribution systems.
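
One common way to exploit an additive objective, sketched below, is to build the Gaussian Process kernel as a sum of low-dimensional kernels, each acting on its own group of coordinates. This is an assumed reading of the idea, not the paper's construction: the RBF base kernel, the grouping of coordinates, and all names are hypothetical.

```python
import math

def rbf(u, v, lengthscale=1.0):
    """Squared-exponential kernel on two equally long coordinate lists."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-0.5 * sq_dist / lengthscale ** 2)

def additive_kernel(x, x_prime, groups, lengthscale=1.0):
    """Sum of RBF kernels, one per coordinate group (an assumed structure
    f(x) = sum_g f_g(x_g) with disjoint low-dimensional groups g)."""
    return sum(
        rbf([x[i] for i in g], [x_prime[i] for i in g], lengthscale)
        for g in groups
    )

if __name__ == "__main__":
    groups = [(0, 1), (2,)]  # hypothetical split of a 3-D input
    print(additive_kernel([0.0, 0.0, 0.0], [0.0, 0.0, 1.0], groups))
```

With such a structure, each group's contribution can be modeled and searched in its own subspace instead of the full input space.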


Learning to be Global Optimizer

arXiv.org Artificial Intelligence

The advancement of artificial intelligence has cast a new light on the development of optimization algorithms. This paper proposes to learn a two-phase (including a minimization phase and an escaping phase) global optimization algorithm for smooth non-convex functions. For the minimization phase, a model-driven deep learning method is developed to learn the update rule of descent direction, which is formalized as a nonlinear combination of historical information, for convex functions. We prove that the resultant algorithm with the proposed adaptive direction guarantees convergence for convex functions. An empirical study shows that the learned algorithm significantly outperforms some well-known classical optimization algorithms, such as gradient descent, conjugate descent and BFGS, and performs well on ill-posed functions. The escaping phase from local optimum is modeled as a Markov decision process with a fixed escaping policy. We further propose to learn an optimal escaping policy by reinforcement learning. The effectiveness of the escaping policies is verified by optimizing synthesized functions and training a deep neural network for CIFAR image classification. The learned two-phase global optimization algorithm demonstrates a promising global search capability on some benchmark functions and machine learning tasks.