Optimization
Distance Majorization and Its Applications
Chi, Eric C., Zhou, Hua, Lange, Kenneth
The problem of minimizing a continuously differentiable convex function over an intersection of closed convex sets is ubiquitous in applied mathematics. It is particularly interesting when it is easy to project onto each separate set, but nontrivial to project onto their intersection. Algorithms based on Newton's method such as the interior point method are viable for small to medium-scale problems. However, modern applications in statistics, engineering, and machine learning are posing problems with potentially tens of thousands of parameters or more. We revisit this convex programming problem and propose an algorithm that scales well with dimensionality. Our proposal is an instance of a sequential unconstrained minimization technique and revolves around three ideas: the majorization-minimization (MM) principle, the classical penalty method for constrained optimization, and quasi-Newton acceleration of fixed-point algorithms. The performance of our distance majorization algorithms is illustrated in several applications.
Planning Solar Array Operations on the International Space Station
Flight controllers manage the orientation and modes of eight large solar arrays that power the International Space Station (ISS). The task requires generating plans that balance complex constraints and preferences. These considerations include context-dependent constraints on viable solar array configurations, temporal limits on transitions between configurations, and preferences on which considerations have priority. The Solar Array Constraint Engine (SACE) treats this operations planning problem as a sequence of tractable constrained optimization problems. SACE uses constraint management and automated planning capabilities to reason about the constraints, to find optimal array configurations subject to these constraints and solution preferences, and to automatically generate solar array operations plans.
Integrated Operations (Re-)Scheduling from Mine to Ship
Sampath, Kameshwaran (IBM Research - India) | Tezabwala, Alfiya (IBM Research - India) | Chabrier, Alain (IBM Software Group) | Payne, Julain (IBM Software Group) | Tiozzo, Fabio (IBM Software Group)
Mining companies have complex supply chains that start from the mining location and stretch thousands of kilometers to the end customer in a different country and continent. The logistics of moving the materials from mines to ship is composed of series of optimization problems like berth allocation , ship scheduling , stockyard scheduling , and rail scheduling , which are individually NP-hard. In this paper, we present a scheduling application, called as IBM Optimization: Mine to Ship , for end-to-end integrated operations scheduling. The application is built on IBM ILOG ODM Enterprise with advanced features like rescheduling under deviations and disturbances, and maintenance scheduling. The modeling and computational complexity of integrated scheduling optimization is tamed using hybrid optimization technique that leverages mathematical programming and constraint programming. The application will benefit the mining companies with increased resource usage, higher throughput, reduced cost of operations, and higher revenue.
Planning Personalised Museum Visits
Berre, Daniel Le (CRIL - CNRS UMR 8188 Université d'Artois) | Marquis, Pierre (CRIL - CNRS UMR 8188 Université d'Artois) | Roussel, Stéphanie (CRIL - CNRS UMR 8188 Université d'Artois)
In this paper, we consider the problem of designing personalised museum visits. Given a set of preferences and constraints a visitor might express on her visit, the aim is to compute the tour that best matches her requirements. The museum visits problem can be expressed as a planning problem, with cost optimization. We show how to bound the number of steps required to find an optimal solution, via the resolution of an instance of the shortest complete walk problem. We also point out an alternative encoding of the museum visits problem as an optimization problem with pseudo-Boolean constraints and a linear objective function. We have evaluated several constraints solvers, a planner and a tailored solver on a number of benchmarks, representing various instances of the museum visits problem corresponding to real museums. Our empirical results show the feasibility of both the planning and the constraint programming approaches. Optimal solutions can be computed for short visits and ``practically good'' solutions for much longer visits.
Fast greedy algorithm for subspace clustering from corrupted and incomplete data
Petukhov, Alexander, Kozlov, Inna
We describe the Fast Greedy Sparse Subspace Clustering (FGSSC) algorithm providing an efficient method for clustering data belonging to a few low-dimensional linear or affine subspaces. The main difference of our algorithm from predecessors is its ability to work with noisy data having a high rate of erasures (missed entries with the known coordinates) and errors (corrupted entries with unknown coordinates). We discuss here how to implement the fast version of the greedy algorithm with the maximum efficiency whose greedy strategy is incorporated into iterations of the basic algorithm. We provide numerical evidences that, in the subspace clustering capability, the fast greedy algorithm outperforms not only the existing state-of-the art SSC algorithm taken by the authors as a basic algorithm but also the recent GSSC algorithm. At the same time, its computational cost is only slightly higher than the cost of SSC. The numerical evidence of the algorithm significant advantage is presented for a few synthetic models as well as for the Extended Yale B dataset of facial images. In particular, the face recognition misclassification rate turned out to be 6-20 times lower than for the SSC algorithm. We provide also the numerical evidence that the FGSSC algorithm is able to perform clustering of corrupted data efficiently even when the sum of subspace dimensions significantly exceeds the dimension of the ambient space.
Fast Dual Variational Inference for Non-Conjugate LGMs
Khan, Mohammad Emtiyaz, Aravkin, Aleksandr Y., Friedlander, Michael P., Seeger, Matthias
Latent Gaussian models (LGMs) are widely used in statistics and machine learning. Bayesian inference in non-conjugate LGMs is difficult due to intractable integrals involving the Gaussian prior and non-conjugate likelihoods. Algorithms based on variational Gaussian (VG) approximations are widely employed since they strike a favorable balance between accuracy, generality, speed, and ease of use. However, the structure of the optimization problems associated with these approximations remains poorly understood, and standard solvers take too long to converge. We derive a novel dual variational inference approach that exploits the convexity property of the VG approximations. We obtain an algorithm that solves a convex optimization problem, reduces the number of variational parameters, and converges much faster than previous methods. Using real-world data, we demonstrate these advantages on a variety of LGMs, including Gaussian process classification, and latent Gaussian Markov random fields.
Multiclass Total Variation Clustering
Bresson, Xavier, Laurent, Thomas, Uminsky, David, von Brecht, James H.
Many clustering models rely on the minimization of an energy over possible partitions of the data set. These discrete optimizations usually pose NPhard problems, however. A natural resolution of this issue involves relaxing the discrete minimization space into a continuous one to obtain an easier minimization procedure. Many current algorithms, such as spectral clustering methods or nonnegative matrix factorization (NMF) methods, follow this relaxation approach. A fundamental problem arises when using this approach, however; in general the solution of the relaxed continuous problem and that of the discrete NPhard problem can differ substantially.
Efficient learning of simplices
Anderson, Joseph, Goyal, Navin, Rademacher, Luis
We show an efficient algorithm for the following problem: Given uniformly random points from an arbitrary n-dimensional simplex, estimate the simplex. The size of the sample and the number of arithmetic operations of our algorithm are polynomial in n. This answers a question of Frieze, Jerrum and Kannan [FJK]. Our result can also be interpreted as efficiently learning the intersection of n+1 half-spaces in R^n in the model where the intersection is bounded and we are given polynomially many uniform samples from it. Our proof uses the local search technique from Independent Component Analysis (ICA), also used by [FJK]. Unlike these previous algorithms, which were based on analyzing the fourth moment, ours is based on the third moment. We also show a direct connection between the problem of learning a simplex and ICA: a simple randomized reduction to ICA from the problem of learning a simplex. The connection is based on a known representation of the uniform measure on a simplex. Similar representations lead to a reduction from the problem of learning an affine transformation of an n-dimensional l_p ball to ICA.
Robustness Analysis of Hottopixx, a Linear Programming Model for Factoring Nonnegative Matrices
Although nonnegative matrix factorization (NMF) is NP-hard in general, it has been shown very recently that it is tractable under the assumption that the input nonnegative data matrix is close to being separable (separability requires that all columns of the input matrix belongs to the cone spanned by a small subset of these columns). Since then, several algorithms have been designed to handle this subclass of NMF problems. In particular, Bittorf, Recht, R\'e and Tropp (`Factoring nonnegative matrices with linear programs', NIPS 2012) proposed a linear programming model, referred to as Hottopixx. In this paper, we provide a new and more general robustness analysis of their method. In particular, we design a provably more robust variant using a post-processing strategy which allows us to deal with duplicates and near duplicates in the dataset.
An estimation of distribution algorithm with adaptive Gibbs sampling for unconstrained global optimization
Velasco, Jonás, Saucedo-Espinosa, Mario A., Escalante, Hugo Jair, Mendoza, Karlo, Villarreal-Rodríguez, César Emilio, Chacón-Mondragón, Óscar L., Rodríguez, Adrián, Berrones, Arturo
In this paper is proposed a new heuristic approach belonging to the field of evolutionary Estimation of Distribution Algorithms (EDAs). EDAs builds a probability model and a set of solutions is sampled from the model which characterizes the distribution of such solutions. The main framework of the proposed method is an estimation of distribution algorithm, in which an adaptive Gibbs sampling is used to generate new promising solutions and, in combination with a local search strategy, it improves the individual solutions produced in each iteration. The Estimation of Distribution Algorithm with Adaptive Gibbs Sampling we are proposing in this paper is called AGEDA. We experimentally evaluate and compare this algorithm against two deterministic procedures and several stochastic methods in three well known test problems for unconstrained global optimization. It is empirically shown that our heuristic is robust in problems that involve three central aspects that mainly determine the difficulty of global optimization problems, namely high-dimensionality, multi-modality and non-smoothness.