Optimization
Optimal Immunization Policy Using Dynamic Programming
Alaeddini, Atiye, Klein, Daniel
Decisions in public health are almost always made in the context of uncertainty. Policy makers responsible for making important decisions are faced with the daunting task of choosing from many possible options. This task is called planning under uncertainty, and is particularly acute when addressing complex systems, such as issues of global health and development. Decision making under uncertainty is a challenging task, and all too often this uncertainty is averaged away to simplify results for policy makers. A popular way to approach this task is to formulate the problem at hand as a (partially observable) Markov decision process, (PO)MDP. This work aims to apply these AI efforts to challenging problems in health and development. In this paper, we developed a framework for optimal health policy design in a dynamic setting. We apply a stochastic dynamic programing approach to identify both the optimal time to change the health intervention policy and the optimal time to collect decision relevant information.
A flexible integer linear programming formulation for scheduling clinician on-call service in hospitals
Landsman, David, Ma, Huiting, Knight, Jesse, Gough, Kevin, Mishra, Sharmistha
Scheduling of personnel in a hospital environment is vital to improving the service provided to patients and balancing the workload assigned to clinicians. Many approaches have been tried and successfully applied to generate efficient schedules in such settings. However, due to the computational complexity of the scheduling problem in general, most approaches resort to heuristics to find a non-optimal solution in a reasonable amount of time. We designed an integer linear programming formulation to find an optimal schedule in a clinical division of a hospital. Our formulation mitigates issues related to computational complexity by minimizing the set of constraints, yet retains sufficient flexibility so that it can be adapted to a variety of clinical divisions. We then conducted a case study for our approach using data from the Infectious Diseases division at St. Michael's Hospital in Toronto, Canada. We analyzed and compared the results of our approach to manually-created schedules at the hospital, and found improved adherence to departmental constraints and clinician preferences. We used simulated data to examine the sensitivity of the runtime of our linear program for various parameters and observed reassuring results, signifying the practicality and generalizability of our approach in different real-world scenarios.
Decision Automation for Electric Power Network Recovery
Sarkale, Yugandhar, Nozhati, Saeed, Chong, Edwin K. P., Ellingwood, Bruce R.
Critical infrastructure systems such as electric power networks, water networks, and transportation systems play a major role in the welfare of any community. In the aftermath of disasters, their recovery is of paramount importance; orderly and efficient recovery involves the assignment of limited resources (a combination of human repair workers and machines) to repair damaged infrastructure components. The decision maker must also deal with uncertainty in the outcome of the resource-allocation actions during recovery. The manual assignment of resources seldom is optimal despite the expertise of the decision maker because of the large number of choices and uncertainties in consequences of sequential decisions. This combinatorial assignment problem under uncertainty is known to be \mbox{NP-hard}. We propose a novel decision technique that addresses the massive number of decision choices for large-scale real-world problems; in addition, our method also features an experiential learning component that adaptively determines the utilization of the computational resources based on the performance of a small number of choices. Our framework is closed-loop, and naturally incorporates all the attractive features of such a decision-making system. In contrast to myopic approaches, which do not account for the future effects of the current choices, our methodology has an anticipatory learning component that effectively incorporates \emph{lookahead} into the solutions. To this end, we leverage the theory of regression analysis, Markov decision processes (MDPs), multi-armed bandits, and stochastic models of community damage from natural disasters to develop a method for near-optimal recovery of communities. Our method contributes to the general problem of MDPs with massive action spaces with application to recovery of communities affected by hazards.
Demystifying Different Variants of Gradient Descent Optimization Algorithm
In this post, we have looked at the batch gradient descent, the need to develop new optimization techniques, and then we briefly discussed how to interpret contour plots. After that, we have looked at behind six different optimization techniques and three different data strategies (batch, mini-batch & stochastic) with an intuitive understanding which helps to know where to use any of these algorithms. In Practice Adam optimizer with mini-batch of sizes 32, 64 and 128 is the default choice, at least for all the image classification tasks which deal with CNN and large sequence to sequence models.
On Education Discrete Optimization Data Science Heuristic & Metaheuristic - CouponED
What is optimization Some real-life situations where we need to optimize an objective The mathematical formalism of optimization How discrete optimization (Combinatorics) differs from continuous optimization Different approaches to solve a Combinatorics problem, including-- The simplest, perfect but slow'Brute Force' method. One of the fastest and practicable'Greedy' heuristic.A look-ahead mechanism to refine the greedy approach. Travelling Salesman Problem Other generic problems in discrete optimization, like the Knapsack Problem How metaheuristic approaches compare to heuristic solutions The nature-inspired class of metaheuristic approaches Ant Colony Optimization: its basis, modus operandi, algorithm and flow chart The R library to implement Ant Colony Optimization and other heuristic solutions Examples of Travelling Salesman Problems solved through different approaches ESSENTIAL: A moderate knowledge of Mathematics (High School level) BOOSTER: Familiarity with some programming language (preferably R) BOOSTER: Interest in solving puzzles and games involving logic BOOSTER: Basic knowhow on what Data Science is about Discrete Optimization is something all of us use in our daily activities when say, we order at a restaurant, decide which subject to study, take up a new activity… or look for a change. It comprises of choosing between alternatives that best suit some objective we have in mind. When such things are formalized, i.e. the objective and the ability of each choice to fulfill that objective are quantified, we get a mathematical expression of the problem we would optimize.
A language processing algorithm for predicting tactical solutions to an operational planning problem under uncertainty
This paper is devoted to the prediction of solutions to a stochastic discrete optimization problem. Through an application, we illustrate how we can use a state-of-the-art neural machine translation (NMT) algorithm to predict the solutions by defining appropriate vocabularies, syntaxes and constraints. We attend to applications where the predictions need to be computed in very short computing time -- in the order of milliseconds or less. The results show that with minimal adaptations to the model architecture and hyperparameter tuning, the NMT algorithm can produce accurate solutions within the computing time budget. While these predictions are slightly less accurate than approximate stochastic programming solutions (sample average approximation), they can be computed faster and with less variability.
Combinatorial Losses through Generalized Gradients of Integer Linear Programs
Gao, Xi, Zhang, Han, Panahi, Aliakbar, Arodz, Tom
When samples have internal structure, we often see a mismatch between the objective optimized during training and the model's goal during inference. For example, in sequence-to-sequence modeling we are interested in high-quality translated sentences, but training typically uses maximum likelihood at the word level. Learning to recognize individual faces from group photos, each captioned with the correct but unordered list of people in it, is another example where a mismatch between training and inference objectives occurs. In both cases, the natural training-time loss would involve a combinatorial problem -- dynamic programming-based global sequence alignment and weighted bipartite graph matching, respectively -- but solutions to combinatorial problems are not differentiable with respect to their input parameters, so surrogate, differentiable losses are used instead. Here, we show how to perform gradient descent over combinatorial optimization algorithms that involve continuous parameters, for example edge weights, and can be efficiently expressed as integer, linear, or mixed-integer linear programs. We demonstrate usefulness of gradient descent over combinatorial optimization in sequence-to-sequence modeling using differentiable encoder-decoder architecture with softmax or Gumbel-softmax, and in weakly supervised learning involving a convolutional, residual feed-forward network for image classification.
Model-Agnostic Meta-Learning using Runge-Kutta Methods
Im, Daniel Jiwoong, Jiang, Yibo, Verma, Nakul
Daniel Jiwoong Im 1, Yibo Jiang 2, and Nakul Verma 3 1 Janelia Research Campus, HHMI, Virgina 2 Harvard University, Massachusetts 3 Columbia University, New York Abstract Meta learning has emerged as an important framework for learning new tasks from just a few examples. The success of any meta-learning model depends on (i) its fast adaptation to new tasks, as well as (ii) having a shared representation across similar tasks. Here we extend the model-agnostic meta-learning (MAML) framework introduced by Finn et al. (2017) to achieve improved performance by analyzing the temporal dynamics of the optimization procedure via the Runge-Kutta method. This method enables us to gain fine-grained control over the optimization and helps us achieve both the adaptation and representation goals across tasks. By leveraging this refined control, we demonstrate that there are multiple principled ways to update MAML and show that the classic MAML optimization is simply a special case of second order Runge-Kutta method that mainly focuses on fast-adaptation. Experiments on benchmark classification, regression and reinforcement learning tasks show that this refined control helps attain improved results. 1 Introduction Building an intelligent system that can learn quickly on a new task with few examples or few experiences is one of the central goals of machine learning. Achieving this goal requires an agent that learns continuously while having the ability to adapt to new tasks with limited data. Meta-learning (Biggs, 1985) has emerged as a compelling framework that strives to attain this challenging goal. There are two main approaches to meta-learning: learning-to-optimize and learning-to-initialize the meta-model (usually encoded as deep network).
Calculating Optimistic Likelihoods Using (Geodesically) Convex Optimization
Nguyen, Viet Anh, Shafieezadeh-Abadeh, Soroosh, Yue, Man-Chung, Kuhn, Daniel, Wiesemann, Wolfram
A fundamental problem arising in many areas of machine learning is the evaluation of the likelihood of a given observation under different nominal distributions. Frequently, these nominal distributions are themselves estimated from data, which makes them susceptible to estimation errors. We thus propose to replace each nominal distribution with an ambiguity set containing all distributions in its vicinity and to evaluate an \emph{optimistic likelihood}, that is, the maximum of the likelihood over all distributions in the ambiguity set. When the proximity of distributions is quantified by the Fisher-Rao distance or the Kullback-Leibler divergence, the emerging optimistic likelihoods can be computed efficiently using either geodesic or standard convex optimization techniques. We showcase the advantages of working with optimistic likelihoods on a classification problem using synthetic as well as empirical data.
Communication-Efficient Asynchronous Stochastic Frank-Wolfe over Nuclear-norm Balls
Zhuo, Jiacheng, Lei, Qi, Dimakis, Alexandros G., Caramanis, Constantine
Large-scale machine learning training suffers from two prior challenges, specifically for nuclear-norm constrained problems with distributed systems: the synchronization slowdown due to the straggling workers, and high communication costs. In this work, we propose an asynchronous Stochastic Frank Wolfe (SFW-asyn) method, which, for the first time, solves the two problems simultaneously, while successfully maintaining the same convergence rate as the vanilla SFW. We implement our algorithm in python (with MPI) to run on Amazon EC2, and demonstrate that SFW-asyn yields speed-ups almost linear to the number of machines compared to the vanilla SFW.