Goto

Collaborating Authors

 Optimization


Augmentative Message Passing for Traveling Salesman Problem and Graph Partitioning

Neural Information Processing Systems

The cutting plane method is an augmentative constrained optimization procedure that is often used with continuous-domain optimization techniques such as linear and convex programs. We investigate the viability of a similar idea within message passing -- for integral solutions -- in the context of two combinatorial problems: 1) For Traveling Salesman Problem (TSP), we propose a factor-graph based on Held-Karp formulation, with an exponential number of constraint factors, each of which has an exponential but sparse tabular form. 2) For graph-partitioning (a.k.a. community mining) using modularity optimization, we introduce a binary variable model with a large number of constraints that enforce formation of cliques. In both cases we are able to derive surprisingly simple message updates that lead to competitive solutions on benchmark instances. In particular for TSP we are able to find near-optimal solutions in the time that empirically grows with $N^3$, demonstrating that augmentation is practical and efficient.


Parallel Direction Method of Multipliers

Neural Information Processing Systems

We consider the problem of minimizing block-separable (non-smooth) convex functions subject to linear constraints. While the Alternating Direction Method of Multipliers (ADMM) for two-block linear constraints has been intensively studied both theoretically and empirically, in spite of some preliminary work, effective generalizations of ADMM to multiple blocks is still unclear. In this paper, we propose a parallel randomized block coordinate method named Parallel Direction Method of Multipliers (PDMM) to solve optimization problems with multi-block linear constraints. At each iteration, PDMM randomly updates some blocks in parallel, behaving like parallel randomized block coordinate descent. We establish the global convergence and the iteration complexity for PDMM with constant step size. We also show that PDMM can do randomized block coordinate descent on overlapping blocks. Experimental results show that PDMM performs better than state-of-the-arts methods in two applications, robust principal component analysis and overlapping group lasso.


Sparse Space-Time Deconvolution for Calcium Image Analysis

Neural Information Processing Systems

We describe a unified formulation and algorithm to find an extremely sparse representation for Calcium image sequences in terms of cell locations, cell shapes, spike timings and impulse responses. Solution of a single optimization problem yields cell segmentations and activity estimates that are on par with the state of the art, without the need for heuristic pre- or postprocessing. Experiments on real and synthetic data demonstrate the viability of the proposed method.


An Exact Double-Oracle Algorithm for Zero-Sum Extensive-Form Games with Imperfect Information

Journal of Artificial Intelligence Research

Developing scalable solution algorithms is one of the central problems in computational game theory. We present an iterative algorithm for computing an exact Nash equilibrium for two-player zero-sum extensive-form games with imperfect information. Our approach combines two key elements: (1) the compact sequence-form representation of extensive-form games and (2) the algorithmic framework of double-oracle methods. The main idea of our algorithm is to restrict the game by allowing the players to play only selected sequences of available actions. After solving the restricted game, new sequences are added by finding best responses to the current solution using fast algorithms. We experimentally evaluate our algorithm on a set of games inspired by patrolling scenarios, board, and card games. The results show significant runtime improvements in games admitting an equilibrium with small support, and substantial improvement in memory use even on games with large support. The improvement in memory use is particularly important because it allows our algorithm to solve much larger game instances than existing linear programming methods. Our main contributions include (1) a generic sequence-form double-oracle algorithm for solving zero-sum extensive-form games; (2) fast methods for maintaining a valid restricted game model when adding new sequences; (3) a search algorithm and pruning methods for computing best-response sequences; (4) theoretical guarantees about the convergence of the algorithm to a Nash equilibrium; (5) experimental analysis of our algorithm on several games, including an approximate version of the algorithm.


Inference for Sparse Conditional Precision Matrices

arXiv.org Machine Learning

Given $n$ i.i.d. observations of a random vector $(X,Z)$, where $X$ is a high-dimensional vector and $Z$ is a low-dimensional index variable, we study the problem of estimating the conditional inverse covariance matrix $\Omega(z) = (E[(X-E[X \mid Z])(X-E[X \mid Z])^T \mid Z=z])^{-1}$ under the assumption that the set of non-zero elements is small and does not depend on the index variable. We develop a novel procedure that combines the ideas of the local constant smoothing and the group Lasso for estimating the conditional inverse covariance matrix. A proximal iterative smoothing algorithm is used to solve the corresponding convex optimization problems. We prove that our procedure recovers the conditional independence assumptions of the distribution $X \mid Z$ with high probability. This result is established by developing a uniform deviation bound for the high-dimensional conditional covariance matrix from its population counterpart, which may be of independent interest. Furthermore, we develop point-wise confidence intervals for individual elements of the conditional inverse covariance matrix. We perform extensive simulation studies, in which we demonstrate that our proposed procedure outperforms sensible competitors. We illustrate our proposal on a S&P 500 stock price data set.


Matching Pursuit LASSO Part II: Applications and Sparse Recovery over Batch Signals

arXiv.org Machine Learning

Matching Pursuit LASSIn Part I \cite{TanPMLPart1}, a Matching Pursuit LASSO ({MPL}) algorithm has been presented for solving large-scale sparse recovery (SR) problems. In this paper, we present a subspace search to further improve the performance of MPL, and then continue to address another major challenge of SR -- batch SR with many signals, a consideration which is absent from most of previous $\ell_1$-norm methods. As a result, a batch-mode {MPL} is developed to vastly speed up sparse recovery of many signals simultaneously. Comprehensive numerical experiments on compressive sensing and face recognition tasks demonstrate the superior performance of MPL and BMPL over other methods considered in this paper, in terms of sparse recovery ability and efficiency. In particular, BMPL is up to 400 times faster than existing $\ell_1$-norm methods considered to be state-of-the-art.O Part II: Applications and Sparse Recovery over Batch Signals


Modeling and Recognition of Smart Grid Faults by a Combined Approach of Dissimilarity Learning and One-Class Classification

arXiv.org Artificial Intelligence

Detecting faults in electrical power grids is of paramount importance, either from the electricity operator and consumer viewpoints. Modern electric power grids (smart grids) are equipped with smart sensors that allow to gather real-time information regarding the physical status of all the component elements belonging to the whole infrastructure (e.g., cables and related insulation, transformers, breakers and so on). In real-world smart grid systems, usually, additional information that are related to the operational status of the grid itself are collected such as meteorological information. Designing a suitable recognition (discrimination) model of faults in a real-world smart grid system is hence a challenging task. This follows from the heterogeneity of the information that actually determine a typical fault condition. The second point is that, for synthesizing a recognition model, in practice only the conditions of observed faults are usually meaningful. Therefore, a suitable recognition model should be synthesized by making use of the observed fault conditions only. In this paper, we deal with the problem of modeling and recognizing faults in a real-world smart grid system, which supplies the entire city of Rome, Italy. Recognition of faults is addressed by following a combined approach of multiple dissimilarity measures customization and one-class classification techniques. We provide here an in-depth study related to the available data and to the models synthesized by the proposed one-class classifier. We offer also a comprehensive analysis of the fault recognition results by exploiting a fuzzy set based reliability decision rule.


Optimal Triggering of Networked Control Systems

arXiv.org Machine Learning

The problem of resource allocation of nonlinear networked control systems is investigated, where, unlike the well discussed case of triggering for stability, the objective is optimal triggering. An approximate dynamic programming approach is developed for solving problems with fixed final times initially and then it is extended to infinite horizon problems. Different cases including Zero-Order-Hold, Generalized Zero-Order-Hold, and stochastic networks are investigated. Afterwards, the developments are extended to the case of problems with unknown dynamics and a model-free scheme is presented for learning the (approximate) optimal solution. After detailed analyses of convergence, optimality, and stability of the results, the performance of the method is demonstrated through different numerical examples.


Support recovery without incoherence: A case for nonconvex regularization

arXiv.org Machine Learning

We demonstrate that the primal-dual witness proof method may be used to establish variable selection consistency and $\ell_\infty$-bounds for sparse regression problems, even when the loss function and/or regularizer are nonconvex. Using this method, we derive two theorems concerning support recovery and $\ell_\infty$-guarantees for the regression estimator in a general setting. Our results provide rigorous theoretical justification for the use of nonconvex regularization: For certain nonconvex regularizers with vanishing derivative away from the origin, support recovery consistency may be guaranteed without requiring the typical incoherence conditions present in $\ell_1$-based methods. We then derive several corollaries that illustrate the wide applicability of our method to analyzing composite objective functions involving losses such as least squares, nonconvex modified least squares for errors-in variables linear regression, the negative log likelihood for generalized linear models, and the graphical Lasso. We conclude with empirical studies to corroborate our theoretical predictions.


First order algorithms in variational image processing

arXiv.org Machine Learning

Variational methods in imaging are nowadays developing towards a quite universal and flexible tool, allowing for highly successful approaches on tasks like denoising, deblurring, inpainting, segmentation, super-resolution, disparity, and optical flow estimation. The overall structure of such approaches is of the form ${\cal D}(Ku) + \alpha {\cal R} (u) \rightarrow \min_u$ ; where the functional ${\cal D}$ is a data fidelity term also depending on some input data $f$ and measuring the deviation of $Ku$ from such and ${\cal R}$ is a regularization functional. Moreover $K$ is a (often linear) forward operator modeling the dependence of data on an underlying image, and $\alpha$ is a positive regularization parameter. While ${\cal D}$ is often smooth and (strictly) convex, the current practice almost exclusively uses nonsmooth regularization functionals. The majority of successful techniques is using nonsmooth and convex functionals like the total variation and generalizations thereof or $\ell_1$-norms of coefficients arising from scalar products with some frame system. The efficient solution of such variational problems in imaging demands for appropriate algorithms. Taking into account the specific structure as a sum of two very different terms to be minimized, splitting algorithms are a quite canonical choice. Consequently this field has revived the interest in techniques like operator splittings or augmented Lagrangians. Here we shall provide an overview of methods currently developed and recent results as well as some computational studies providing a comparison of different methods and also illustrating their success in applications.