Optimization
A tractable ellipsoidal approximation for voltage regulation problems
Li, Pan, Jin, Baihong, Xiong, Ruoxuan, Wang, Dai, Sangiovanni-Vincentelli, Alberto, Zhang, Baosen
We present a machine learning approach to the solution of chance constrained optimizations in the context of voltage regulation problems in power system operation. The novelty of our approach resides in approximating the feasible region of uncertainty with an ellipsoid. We formulate this problem using a learning model similar to Support Vector Machines (SVM) and propose a sampling algorithm that efficiently trains the model. We demonstrate our approach on a voltage regulation problem using standard IEEE distribution test feeders.
Robust Influence Maximization for Hyperparametric Models
Kalimeris, Dimitris, Kaplun, Gal, Singer, Yaron
In this paper we study the problem of robust influence maximization in the independent cascade model under a hyperparametric assumption. In social networks users influence and are influenced by individuals with similar characteristics and as such they are associated with some features. A recent surging research direction in influence maximization focuses on the case where the edge probabilities on the graph are not arbitrary but are generated as a function of the features of the users and a global hyperparameter. We propose a model where the objective is to maximize the worst-case number of influenced users for any possible value of that hyperparameter. We provide theoretical results showing that proper robust solution in our model is NP-hard and an algorithm that achieves improper robust optimization. We make-use of sampling based techniques and of the renowned multiplicative weight updates algorithm. Additionally we validate our method empirically and prove that it outperforms the state-of-the-art robust influence maximization techniques.
Learning Self-Game-Play Agents for Combinatorial Optimization Problems
Recent progress in reinforcement learning (RL) using self-game-play has shown remarkable performance on several board games (e.g., Chess and Go) as well as video games (e.g., Atari games and Dota2). It is plausible to consider that RL, starting from zero knowledge, might be able to gradually approximate a winning strategy after a certain amount of training. In this paper, we explore neural Monte-Carlo-Tree-Search (neural MCTS), an RL algorithm which has been applied successfully by DeepMind to play Go and Chess at a super-human level. We try to leverage the computational power of neural MCTS to solve a class of combinatorial optimization problems. Following the idea of Hintikka's Game-Theoretical Semantics, we propose the Zermelo Gamification (ZG) to transform specific combinatorial optimization problems into Zermelo games whose winning strategies correspond to the solutions of the original optimization problem. The ZG also provides a specially designed neural MCTS. We use a combinatorial planning problem for which the ground-truth policy is efficiently computable to demonstrate that ZG is promising.
Practical Mathematical Optimization - Basic Optimization Theory and Gradient-Based Algorithms Jan A Snyman Springer
This textbook presents a wide range of tools for a course in mathematical optimization for upper undergraduate and graduate students in mathematics, engineering, computer science, and other applied sciences. Basic optimization principles are presented with emphasis on gradient-based numerical optimization strategies and algorithms for solving both smooth and noisy discontinuous optimization problems. Attention is also paid to the difficulties of expense of function evaluations and the existence of multiple minima that often unnecessarily inhibit the use of gradient-based methods. This second edition addresses further advancements of gradient-only optimization strategies to handle discontinuities in objective functions. New chapters discuss the construction of surrogate models as well as new gradient-only solution strategies and numerical optimization using Python.
Self-Tuning Networks: Bilevel Optimization of Hyperparameters using Structured Best-Response Functions
MacKay, Matthew, Vicol, Paul, Lorraine, Jon, Duvenaud, David, Grosse, Roger
Hyperparameter optimization can be formulated as a bilevel optimization problem, where the optimal parameters on the training set depend on the hyperparameters. We aim to adapt regularization hyperparameters for neural networks by fitting compact approximations to the best-response function, which maps hyperparameters to optimal weights and biases. We show how to construct scalable best-response approximations for neural networks by modeling the best-response as a single network whose hidden units are gated conditionally on the regularizer. We justify this approximation by showing the exact best-response for a shallow linear network with L2-regularized Jacobian can be represented by a similar gating mechanism. We fit this model using a gradient-based hyperparameter optimization algorithm which alternates between approximating the best-response around the current hyperparameters and optimizing the hyperparameters using the approximate best-response function. Unlike other gradient-based approaches, we do not require differentiating the training loss with respect to the hyperparameters, allowing us to tune discrete hyperparameters, data augmentation hyperparameters, and dropout probabilities. Because the hyperparameters are adapted online, our approach discovers hyperparameter schedules that can outperform fixed hyperparameter values. Empirically, our approach outperforms competing hyperparameter optimization methods on large-scale deep learning problems. We call our networks, which update their own hyperparameters online during training, Self-Tuning Networks (STNs).
Nonlinear input design as optimal control of a Hamiltonian system
Umenberger, Jack, Schön, Thomas B.
We propose an input design method for a general class of parametric probabilistic models, including nonlinear dynamical systems with process noise. The goal of the procedure is to select inputs such that the parameter posterior distribution concentrates about the true value of the parameters; however, exact computation of the posterior is intractable. By representing (samples from) the posterior as trajectories from a certain Hamiltonian system, we transform the input design task into an optimal control problem. The method is illustrated via numerical examples, including MRI pulse sequence design.
Mixture Modeling of Global Shape Priors and Autoencoding Local Intensity Priors for Left Atrium Segmentation
Sodergren, Tim, Bhalodia, Riddhish, Whitaker, Ross, Cates, Joshua, Marrouche, Nassir, Elhabian, Shireen
Difficult image segmentation problems, for instance left atrium MRI, can be addressed by incorporating shape priors to find solutions that are consistent with known objects. Nonetheless, a single multivariate Gaussian is not an adequate model in cases with significant nonlinear shape variation or where the prior distribution is multimodal. Nonparametric density estimation is more general, but has a ravenous appetite for training samples and poses serious challenges in optimization, especially in high dimensional spaces. Here, we propose a maximum-a-posteriori formulation that relies on a generative image model by incorporating both local intensity and global shape priors. We use deep autoencoders to capture the complex intensity distribution while avoiding the careful selection of hand-crafted features. We formulate the shape prior as a mixture of Gaussians and learn the corresponding parameters in a high-dimensional shape space rather than pre-projecting onto a low-dimensional subspace. In segmentation, we treat the identity of the mixture component as a latent variable and marginalize it within a generalized expectation-maximization framework. We present a conditional maximization-based scheme that alternates between a closed-form solution for component-specific shape parameters that provides a global update-based optimization strategy, and an intensity-based energy minimization that translates the global notion of a nonlinear shape prior into a set of local penalties. We demonstrate our approach on the left atrial segmentation from gadolinium-enhanced MRI, which is useful in quantifying the atrial geometry in patients with atrial fibrillation.
Coping with Large Traffic Volumes in Schedule-Driven Traffic Signal Control
Hu, Hsu-Chieh, Smith, Stephen F.
Recent work in decentralized, schedule-driven traffic control has demonstrated the ability to significantly improve traffic flow efficiency in complex urban road networks. However, in situations where vehicle volumes increase to the point that the physical capacity of a road network reaches or exceeds saturation, it has been observed that the effectiveness of a schedule-driven approach begins to degrade, leading to progressively higher network congestion. In essence, the traffic control problem becomes less of a scheduling problem and more of a queue management problem in this circumstance. In this paper we propose a composite approach to real-time traffic control that uses sensed information on queue lengths to influence scheduling decisions and gracefully shift the signal control strategy to queue management in high volume/high congestion settings. Specifically, queue-length information is used to establish weights for the sensed vehicle clusters that must be scheduled through a given intersection at any point, and hence bias the wait time minimization calculation. To compute these weights, we develop a model in which successive movement phases are viewed as different states of an Ising model, and parameters quantify strength of interactions. To ensure scalability, queue information is only exchanged between direct neighbors and the asynchronous nature of local intersection scheduling is preserved. We demonstrate the potential of the approach through microscopic traffic simulation of a real-world road network, showing a 60% reduction in average wait times over the baseline schedule-driven approach in heavy traffic scenarios. We also report initial field test results, which show the ability to reduce queues during heavy traffic periods.
Inertial Block Mirror Descent Method for Non-Convex Non-Smooth Optimization
Hien, Le Thi Khanh, Gillis, Nicolas, Patrinos, Panagiotis
In this paper, we propose inertial versions of block coordinate descent methods for solving non-convex non-smooth composite optimization problems. We use the general framework of Bregman distance functions to compute the proximal maps. Our method not only allows using two different extrapolation points to evaluate gradients and adding the inertial force, but also takes advantage of randomly picking the block of variables to update. Moreover, our method does not require a restarting step, and as such, it is not a monotonically decreasing method. To prove the convergence of the whole generated sequence to a critical point, we modify the convergence proof recipe of Bolte, Sabach and Teboulle (Proximal alternating linearized minimization for non-convex and non-smooth problems, Math. Prog. 146(1):459--494, 2014), and combine it with auxiliary functions. We deploy the proposed methods to solve non-negative matrix factorization (NMF) problems and show that they compete favourably with the state-of-the-art NMF algorithms.
Using a Segmenting Description approach in Multiple Criteria Decision Aiding
Kadzinski, Milosz, Badura, Jan, Figueira, Jose Rui
We propose a new method for analyzing a set of parameters in a multiple criteria ranking method. Unlike the existing techniques, we do not use any optimization technique, instead incorporating and extending a Segmenting Description approach. While considering a value-based preference disaggregation method, we demonstrate the usefulness of the introduced algorithm in a multi-purpose decision analysis exploiting a system of inequalities that models the Decision Maker's preferences. Specifically, we discuss how it can be applied for verifying the consistency between the revealed and estimated preferences as well as for identifying the sources of potential incoherence. Moreover, we employ the method for conducting robustness analysis, i.e., discovering a set of all compatible parameter values and verifying the stability of suggested recommendation in view of multiplicity of feasible solutions. In addition, we make clear its suitability for generating arguments about the validity of outcomes and the role of particular criteria. We discuss the favorable characteristics of the Segmenting Description approach which enhance its suitability for use in Multiple Criteria Decision Aiding. These include keeping in memory an entire process of transforming a system of inequalities and avoiding the need for processing the inequalities contained in the basic system which is subsequently enriched with some hypothesis to be verified. The applicability of the proposed method is exemplified on a numerical study.