Goto

Collaborating Authors

 Optimization


RIOT: a Stochastic-based Method for Workflow Scheduling in the Cloud

arXiv.org Artificial Intelligence

Cloud computing provides engineers or scientists a place to run complex computing tasks. Finding a workflow's deployment configuration in a cloud environment is not easy. Traditional workflow scheduling algorithms were based on some heuristics, e.g. reliability greedy, cost greedy, cost-time balancing, etc., or more recently, the meta-heuristic methods, such as genetic algorithms. These methods are very slow and not suitable for rescheduling in the dynamic cloud environment. This paper introduces RIOT (Randomized Instance Order Types), a stochastic based method for workflow scheduling. RIOT groups the tasks in the workflow into virtual machines via a probability model and then uses an effective surrogate-based method to assess a large amount of potential scheduling. Experiments in dozens of study cases showed that RIOT executes tens of times faster than traditional methods while generating comparable results to other methods.


Global Convergence Analysis of the Flower Pollination Algorithm: A Discrete-Time Markov Chain Approach

arXiv.org Artificial Intelligence

Flower pollination algorithm is a recent metaheuristic algorithm for solving nonlinear global optimization problems. The algorithm has also been extended to solve multiobjective optimization with promising results. In this work, we analyze this algorithm mathematically and prove its convergence properties by using Markov chain theory. By constructing the appropriate transition probability for a population of flower pollen and using the homogeneity property, it can be shown that the constructed stochastic sequences can converge to the optimal set. Under the two proper conditions for convergence, it is proved that the simplified flower pollination algorithm can indeed satisfy these convergence conditions and thus the global convergence of this algorithm can be guaranteed. Numerical experiments are used to demonstrate that the flower pollination algorithm can converge quickly in practice and can thus achieve global optimality efficiently.


Attenuate Locally, Win Globally: An Attenuation-based Framework for Online Stochastic Matching with Timeouts

arXiv.org Artificial Intelligence

Online matching problems have garnered significant attention in recent years due to numerous applications in e-commerce, online advertisements, ride-sharing, etc. Many of them capture the uncertainty in the real world by including stochasticity in both the arrival process and the matching process. The Online Stochastic Matching with Timeouts problem introduced by Bansal, et al., (Algorithmica, 2012) models matching markets (e.g., E-Bay, Amazon). Buyers arrive from an independent and identically distributed (i.i.d.) known distribution on buyer profiles and can be shown a list of items one at a time. Each buyer has some probability of purchasing each item and a limit (timeout) on the number of items they can be shown. Bansal et al., (Algorithmica, 2012) gave a 0.12-competitive algorithm which was improved by Adamczyk, et al., (ESA, 2015) to 0.24. We present an online attenuation framework that uses an algorithm for offline stochastic matching as a black box. On the upper bound side, we show that this framework, combined with a black-box adapted from Bansal et al., (Algorithmica, 2012), yields an online algorithm which nearly doubles the ratio to 0.46. On the lower bound side, we show that no algorithm can achieve a ratio better than 0.632 using the standard LP for this problem. This framework has a high potential for further improvements since new algorithms for offline stochastic matching can directly improve the ratio for the online problem. Our online framework also has the potential for a variety of extensions. For example, we introduce a natural generalization: Online Stochastic Matching with Two-sided Timeouts in which both online and offline vertices have timeouts. Our framework provides the first algorithm for this problem achieving a ratio of 0.30. We once again use the algorithm of Adamczyk et al., (ESA, 2015) as a black-box and plug-it into our framework.


Autotune: A Derivative-free Optimization Framework for Hyperparameter Tuning

arXiv.org Machine Learning

Machine learning applications often require hyperparameter tuning. The hyperparameters usually drive both the efficiency of the model training process and the resulting model quality. For hyperparameter tuning, machine learning algorithms are complex black-boxes. This creates a class of challenging optimization problems, whose objective functions tend to be nonsmooth, discontinuous, unpredictably varying in computational expense, and include continuous, categorical, and/or integer variables. Further, function evaluations can fail for a variety of reasons including numerical difficulties or hardware failures. Additionally, not all hyperparameter value combinations are compatible, which creates so called hidden constraints. Robust and efficient optimization algorithms are needed for hyperparameter tuning. In this paper we present an automated parallel derivative-free optimization framework called \textbf{Autotune}, which combines a number of specialized sampling and search methods that are very effective in tuning machine learning models despite these challenges. Autotune provides significantly improved models over using default hyperparameter settings with minimal user interaction on real-world applications. Given the inherent expense of training numerous candidate models, we demonstrate the effectiveness of Autotune's search methods and the efficient distributed and parallel paradigms for training and tuning models, and also discuss the resource trade-offs associated with the ability to both distribute the training process and parallelize the tuning process.


Two Use Cases of Machine Learning for SDN-Enabled IP/Optical Networks: Traffic Matrix Prediction and Optical Path Performance Prediction

arXiv.org Machine Learning

We describe two applications of machine learning in the context of IP/Optical networks. The first one allows agile management of resources at a core IP/Optical network by using machine learning for short-term and long-term prediction of traffic flows and joint global optimization of IP and optical layers using colorless/directionless (CD) flexible ROADMs. Multilayer coordination allows for significant cost savings, flexible new services to meet dynamic capacity needs, and improved robustness by being able to proactively adapt to new traffic patterns and network conditions. The second application is important as we migrate our metro networks to Open ROADM networks, to allow physical routing without the need for detailed knowledge of optical parameters. We discuss a proof-of-concept study, where detailed performance data for wavelengths on a current flexible ROADM network is used for machine learning to predict the optical performance of each wavelength. Both applications can be efficiently implemented by using a SDN (Software Defined Network) controller.


RSG: Beating Subgradient Method without Smoothness and Strong Convexity

arXiv.org Machine Learning

In this paper, we study the efficiency of a {\bf R}estarted {\bf S}ub{\bf G}radient (RSG) method that periodically restarts the standard subgradient method (SG). We show that, when applied to a broad class of convex optimization problems, RSG method can find an $\epsilon$-optimal solution with a low complexity than SG method. In particular, we first show that RSG can reduce the dependence of SG's iteration complexity on the distance between the initial solution and the optimal set to that between the $\epsilon$-level set and the optimal set. In addition, we show the advantages of RSG over SG in solving three different families of convex optimization problems. (a) For the problems whose epigraph is a polyhedron, RSG is shown to converge linearly. (b) For the problems with local quadratic growth property, RSG has an $O(\frac{1}{\epsilon}\log(\frac{1}{\epsilon}))$ iteration complexity. (c) For the problems that admit a local Kurdyka-\L ojasiewicz property with a power constant of $\beta\in[0,1)$, RSG has an $O(\frac{1}{\epsilon^{2\beta}}\log(\frac{1}{\epsilon}))$ iteration complexity. On the contrary, with only the standard analysis, the iteration complexity of SG is known to be $O(\frac{1}{\epsilon^2})$ for these three classes of problems. The novelty of our analysis lies at exploiting the lower bound of the first-order optimality residual at the $\epsilon$-level set. It is this novelty that allows us to explore the local properties of functions (e.g., local quadratic growth property, local Kurdyka-\L ojasiewicz property, more generally local error bounds) to develop the improved convergence of RSG. We demonstrate the effectiveness of the proposed algorithms on several machine learning tasks including regression and classification.


Bayesian Metabolic Flux Analysis reveals intracellular flux couplings

arXiv.org Machine Learning

Markus Heinonen 1, 2, Maria Osmala 1, Henrik Mannerstr om 1, Janne Wallenius 3 Samuel Kaski 1, 2, Juho Rousu 1, 2 and Harri L ahdesm aki 1 1 Department of Computer Science, Aalto University, Espoo, 02150, Finland 2 Helsinki Institute for Information Technology, Finland 3 Institute for Molecular Medicine Finland, Helsinki, Finland Abstract Motivation: Metabolic flux balance analyses are a standard tool in analysing metabolic reaction rates compatible with measurements, steady-state and the metabolic reaction network stoichiometry. Flux analysis methods commonly place unrealistic assumptions on fluxes due to the convenience of formulating the problem as a linear programming model, and most methods ignore the notable uncertainty in flux estimates. Results: We introduce a novel paradigm of Bayesian metabolic flux analysis that models the reactions of the whole genome-scale cellular system in probabilistic terms, and can infer the full flux vector distribution of genome-scale metabolic systems based on exchange and intracellular (e.g. The Bayesian model couples all fluxes jointly together in a simple truncated multivariate posterior distribution, which reveals informative flux couplings. Our model is a plugin replacement to conventional metabolic balance methods, such as flux balance analysis (FBA). Our experiments indicate that we can characterise the genome-scale flux covariances, reveal flux couplings, and determine more intracellular unobserved fluxes in C. acetobutylicum from 13C data than flux variability analysis. Contact: markus.o.heinonen@aalto.fi 1 Introduction Metabolic modelling considers networks of up to thousands of chemical reactions that transform metabolite molecules within cellular organisms (Palsson, 2015). The key problem of metabolism is estimation of the reaction rates, or fluxes, of the system of the highly interdependent intracellular fluxes from measurements of few exchange fluxes that transfer nutrients or products between the external medium and the cell. The dominant approach to flux estimation is the celebrated Flux Balance Analysis (FBA) framework that finds reaction rates that maximise prespecified cellular growth function (Feist and Palsson, 2010), while assuming the cell is in a steady-state, where concentrations of intracellular metabolites do not change (Almaas et al., 2004). The FBA problem can be casted as a convenient and computationally efficient linear programming problem of solving a system of linear steady-state constraints while maximising a linear growth target (Orth et al., 2010), and where flux measurements can be encoded as constraints to the fluxes (Carreira et al., 2014).


Two-Player Games for Efficient Non-Convex Constrained Optimization

arXiv.org Machine Learning

In recent years, constrained optimization has become increasingly relevant to the machine learning community, with applications including Neyman-Pearson classification, robust optimization, and fair machine learning. A natural approach to constrained optimization is to optimize the Lagrangian, but this is not guaranteed to work in the non-convex setting. Instead, we prove that, given a Bayesian optimization oracle, a modified Lagrangian approach can be used to find a distribution over no more than m+1 models (where m is the number of constraints) that is nearly-optimal and nearly-feasible w.r.t. the original constrained problem. Interestingly, our method can be extended to non-differentiable--even discontinuous--constraints (where assuming a Bayesian optimization oracle is not realistic) by viewing constrained optimization as a non-zero-sum two-player game. The first player minimizes external regret in terms of easy-to-optimize "proxy constraints", while the second player enforces the original constraints by minimizing swap-regret.


A Multi-task Selected Learning Approach for Solving New Type 3D Bin Packing Problem

arXiv.org Artificial Intelligence

This paper studies a new type of 3D bin packing problem (BPP), in which a number of cuboid-shaped items must be put into a bin one by one orthogonally. The objective is to find a way to place these items that can minimize the surface area of the bin. This problem is based on the fact that there is no fixed-sized bin in many real business scenarios and the cost of a bin is proportional to its surface area. Based on previous research on 3D BPP, the surface area is determined by the sequence, spatial locations and orientations of items. It is a new NP-hard combinatorial optimization problem on unfixed-sized bin packing, for which we propose a multi-task framework based on Selected Learning, generating the sequence and orientations of items packed into the bin simultaneously. During training steps, Selected Learning chooses one of loss functions derived from Deep Reinforcement Learning and Supervised Learning corresponding to the training procedure. Numerical results show that the method proposed significantly outperforms Lego baselines by a substantial gain of 7.52%. Moreover, we produce large scale 3D Bin Packing order data set for studying bin packing problems and will release it to the research community.


An Adaptive Clipping Approach for Proximal Policy Optimization

arXiv.org Artificial Intelligence

Very recently proximal policy optimization (PPO) algorithms have been proposed as first-order optimization methods for effective reinforcement learning. While PPO is inspired by the same learning theory that justifies trust region policy optimization (TRPO), PPO substantially simplifies algorithm design and improves data efficiency by performing multiple epochs of \emph{clipped policy optimization} from sampled data. Although clipping in PPO stands for an important new mechanism for efficient and reliable policy update, it may fail to adaptively improve learning performance in accordance with the importance of each sampled state. To address this issue, a new surrogate learning objective featuring an adaptive clipping mechanism is proposed in this paper, enabling us to develop a new algorithm, known as PPO-$\lambda$. PPO-$\lambda$ optimizes policies repeatedly based on a theoretical target for adaptive policy improvement. Meanwhile, destructively large policy update can be effectively prevented through both clipping and adaptive control of a hyperparameter $\lambda$ in PPO-$\lambda$, ensuring high learning reliability. PPO-$\lambda$ enjoys the same simple and efficient design as PPO. Empirically on several Atari game playing tasks and benchmark control tasks, PPO-$\lambda$ also achieved clearly better performance than PPO.