Optimization
Parallel Direction Method of Multipliers
Huahua Wang, Arindam Banerjee, Zhi-Quan Luo
We consider the problem of minimizing block-separable (non-smooth) convex functions subject to linear constraints. While the Alternating Direction Method of Multipliers (ADMM) for two-block linear constraints has been intensively studied both theoretically and empirically, in spite of some preliminary work, effective generalizations of ADMM to multiple blocks is still unclear. In this paper, we propose a parallel randomized block coordinate method named Parallel Direction Method of Multipliers (PDMM) to solve optimization problems with multi-block linear constraints. At each iteration, PDMM randomly updates some blocks in parallel, behaving like parallel randomized block coordinate descent. We establish the global convergence and the iteration complexity for PDMM with constant step size. We also show that PDMM can do randomized block coordinate descent on overlapping blocks. Experimental results show that PDMM performs better than state-of-the-arts methods in two applications, robust principal component analysis and overlapping group lasso.
Beyond the Birkhoff Polytope: Convex Relaxations for Vector Permutation Problems
We modify the recent convex formulation of the 2-SUM problem introduced by Fogel et al. [2] to use this polytope, and demonstrate how we can attain results of similar quality in significantly less computational time for large n . To our knowledge, this is the first usage of Goemans' compact formulation of the permutahedron in a convex optimization problem. We also introduce a simpler regularization scheme for this convex formulation of the 2-SUM problem that yields good empirical results.
Export Reviews, Discussions, Author Feedback and Meta-Reviews
First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. The paper analyzes ADMM with the quadratic term in the augmented Lagrangian replaced by a Bregman divergence. Authors prove convergence rate results where L2 distance (of the standard ADMM guarantee) is replaced by the corresponding Bregman divergence. Similar to mirror descent, this can yield an improvement in the dimension-dependent constants of convergence bounds. Authors then demonstrate the utility of their algorithm on synthetic instances of a specific optimization problem (mass transportation problem).
Bregman Alternating Direction Method of Multipliers
The mirror descent algorithm (MDA) generalizes gradient descent by using a Bregman divergence to replace squared Euclidean distance. In this paper, we similarly generalize the alternating direction method of multipliers (ADMM) to Bregman ADMM (BADMM), which allows the choice of different Bregman divergences to exploit the structure of problems. BADMM provides a unified framework for ADMM and its variants, including generalized ADMM, inexact ADMM and Bethe ADMM. We establish the global convergence and the O (1 /T) iteration complexity for BADMM. In some cases, BADMM can be faster than ADMM by a factor of O (n/ ln n) where n is the dimensionality. In solving the linear program of mass transportation problem, BADMM leads to massive parallelism and can easily run on GPU. BADMM is several times faster than highly optimized commercial software Gurobi.
Export Reviews, Discussions, Author Feedback and Meta-Reviews
First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. The paper under review'Predictive Entropy Search for Efficient Global Optimization of Black-box Functions' presents an approach for Bayesian optimization that measures entropy in the posterior predictive distribution. The authors present a method named Predictive Entropy Search (PES) that is derived by a reparameterization of the expected information gain based on the symmetry of mutual information. The authors describe the posterior sampling, predictive entropy approximation and hyper-parameter learning steps of PES. The advantages of the proposed method are that it is more efficient and accurate than alternatives and that it allows to handle the hyper-parameters in a fully Bayesian way.
Predictive Entropy Search for Efficient Global Optimization of Black-box Functions
We propose a novel information-theoretic approach for Bayesian optimization called Predictive Entropy Search (PES). At each iteration, PES selects the next evaluation point that maximizes the expected information gained with respect to the global maximum. PES codifies this intractable acquisition function in terms of the expected reduction in the differential entropy of the predictive distribution. This reformulation allows PES to obtain approximations that are both more accurate and efficient than other alternatives such as Entropy Search (ES). Furthermore, PES can easily perform a fully Bayesian treatment of the model hy-perparameters while ES cannot. We evaluate PES in both synthetic and real-world applications, including optimization problems in machine learning, finance, biotechnology, and robotics. We show that the increased accuracy of PES leads to significant gains in optimization performance.
Interior Point Solving for LP-based prediction+optimisation
Solving optimization problems is the key to decision making in many real-life analytics applications. However, the coefficients of the optimization problems are often uncertain and dependent on external factors, such as future demand or energy or stock prices. Machine learning (ML) models, especially neural networks, are increasingly being used to estimate these coefficients in a data-driven way.