Goto

Collaborating Authors

 Optimization


Optimization in SMT with LA(Q) Cost Functions

arXiv.org Artificial Intelligence

In the contexts of automated reasoning and formal verification, important decision problems are effectively encoded into Satisfiability Modulo Theories (SMT). In the last decade efficient SMT solvers have been developed for several theories of practical interest (e.g., linear arithmetic, arrays, bit-vectors). Surprisingly, very few work has been done to extend SMT to deal with optimization problems; in particular, we are not aware of any work on SMT solvers able to produce solutions which minimize cost functions over arithmetical variables. This is unfortunate, since some problems of interest require this functionality. In this paper we start filling this gap. We present and discuss two general procedures for leveraging SMT to handle the minimization of LA(Q) cost functions, combining SMT with standard minimization techniques. We have implemented the procedures within the MathSAT SMT solver. Due to the absence of competitors in AR and SMT domains, we have experimentally evaluated our implementation against state-of-the-art tools for the domain of linear generalized disjunctive programming (LGDP), which is closest in spirit to our domain, on sets of problems which have been previously proposed as benchmarks for the latter tools. The results show that our tool is very competitive with, and often outperforms, these tools on these problems, clearly demonstrating the potential of the approach.


A Reconstruction Error Formulation for Semi-Supervised Multi-task and Multi-view Learning

arXiv.org Machine Learning

A significant challenge to make learning techniques more sui table for general purpose use is to move beyond i) complete supervision, ii) low dimensional data, iii) a single t ask and single view per instance. Solving these challenges a llows working with "Big Data" problems that are typically high dim ensional with multiple (but possibly incomplete) labeling s and views. While other work has addressed each of these probl ems separately, in this paper we show how to address them together, namelysemi-supervised dimension reduction for multi-task and multi-view learning (SSDR-MML), which performs optimization for dimension reduction and label inference in semi-supervised setting. The proposed framework is designed to handle both multi-task and multi-view learning settings, and can be easily adapted to many useful applications. Inform ation obtained from all tasks and views is combined via reconstruction errors in a linear fashion that can be efficiently solvedusing an alternating optimization scheme. Our formulation has a number of advantages. W e explicitly model the information combining mechanism as a data structure (a weight/nearest-nei ghbor matrix) which allows investigating fundamental ques tions in multi-task and multi-view learning. W e address one such question by presenting a general measure to quantify the success of simultaneous learning of multiple tasks or from multiple views. W e show that our SSDR-MML approach can outperform many state-of-the-art baseline methods and demonstrate the effectiveness of connecting dimension reduction and learning.


Sparse Non Gaussian Component Analysis by Semidefinite Programming

arXiv.org Machine Learning

Sparse non-Gaussian component analysis (SNGCA) is an unsupervised method of extracting a linear structure from a high dimensional data based on estimating a low-dimensional non-Gaussian data component. In this paper we discuss a new approach to direct estimation of the projector on the target space based on semidefinite programming which improves the method sensitivity to a broad variety of deviations from normality. We also discuss the procedures which allows to recover the structure when its effective dimension is unknown.


Distance-Based Bias in Model-Directed Optimization of Additively Decomposable Problems

arXiv.org Artificial Intelligence

For many optimization problems it is possible to define a distance metric between problem variables that correlates with the likelihood and strength of interactions between the variables. For example, one may define a metric so that the dependencies between variables that are closer to each other with respect to the metric are expected to be stronger than the dependencies between variables that are further apart. The purpose of this paper is to describe a method that combines such a problem-specific distance metric with information mined from probabilistic models obtained in previous runs of estimation of distribution algorithms with the goal of solving future problem instances of similar type with increased speed, accuracy and reliability. While the focus of the paper is on additively decomposable problems and the hierarchical Bayesian optimization algorithm, it should be straightforward to generalize the approach to other model-directed optimization techniques and other problem classes. Compared to other techniques for learning from experience put forward in the past, the proposed technique is both more practical and more broadly applicable.


Optimistic Optimization of a Deterministic Function without the Knowledge of its Smoothness

Neural Information Processing Systems

We consider a global optimization problem of a deterministic function f in a semimetric space, given a finite budget ofnevaluations. The functionf is assumed to be locally smooth (around one of its global maxima) with respect to a semi-metric l. We describe two algorithms based on optimistic exploration that use a hierarchical partitioning of the space at all scales. A first contribution is an algorithm, DOO, that requires the knowledge of l. We report a finite-sample performance bound in terms of a measure of the quantity of near-optimal states. We then define a second algorithm, SOO, which does not require the knowledge of the semimetric l under which f is smooth, and whose performance is almost as good as DOO optimally-fitted.


Optimistic Optimization of a Deterministic Function without the Knowledge of its Smoothness

Neural Information Processing Systems

We consider a global optimization problem of a deterministic function f in a semimetric space, given a finite budget ofnevaluations. The functionf is assumed to be locally smooth (around one of its global maxima) with respect to a semi-metric l. We describe two algorithms based on optimistic exploration that use a hierarchical partitioning of the space at all scales. A first contribution is an algorithm, DOO, that requires the knowledge of l. We report a finite-sample performance bound in terms of a measure of the quantity of near-optimal states. We then define a second algorithm, SOO, which does not require the knowledge of the semimetric l under which f is smooth, and whose performance is almost as good as DOO optimally-fitted.


Approximating Semidefinite Programs in Sublinear Time

Neural Information Processing Systems

In recent years semidefinite optimization has become a tool of major importance in various optimization and machine learning problems. In many of these problems the amount of data in practice is so large that there is a constant need for faster algorithms. In this work we present the first sublinear time approximation algorithm for semidefinite programs which we believe may be useful for such problems in which the size of data may cause even linear time algorithms to have prohibitive running times in practice. We present the algorithm and its analysis alongside with some theoretical lower bounds and an improved algorithm for the special problem of supervised learning of a distance metric.


Optimistic Optimization of a Deterministic Function without the Knowledge of its Smoothness

Neural Information Processing Systems

We consider a global optimization problem of a deterministic function f in a semimetric space,given a finite budget ofnevaluations. The functionf is assumed to be locally smooth (around one of its global maxima) with respect to a semi-metric l. We describe two algorithms based on optimistic exploration that use a hierarchical partitioningof the space at all scales. A first contribution is an algorithm, DOO, that requires the knowledge of l. We report a finite-sample performance bound in terms of a measure of the quantity of near-optimal states. We then define a second algorithm, SOO, which does not require the knowledge of the semimetric lunder which f is smooth, and whose performance is almost as good as DOO optimally-fitted.


Clustered Multi-Task Learning Via Alternating Structure Optimization

Neural Information Processing Systems

Multi-task learning (MTL) learns multiple related tasks simultaneously to improve generalization performance. Alternating structure optimization (ASO) is a popular MTL method that learns a shared low-dimensional predictive structure on hypothesis spaces from multiple related tasks. It has been applied successfully in many real world applications. As an alternative MTL approach, clustered multi-task learning (CMTL) assumes that multiple tasks follow a clustered structure, i.e., tasks are partitioned into a set of groups where tasks in the same group are similar to each other, and that such a clustered structure is unknown a priori. The objectives in ASO and CMTL differ in how multiple tasks are related. Interestingly, we show in this paper the equivalence relationship between ASO and CMTL, providing significant new insights into ASO and CMTL as well as their inherent relationship. The CMTL formulation is non-convex, and we adopt a convex relaxation to the CMTL formulation. We further establish the equivalence relationship between the proposed convex relaxation of CMTL and an existing convex relaxation of ASO, and show that the proposed convex CMTL formulation is significantly more efficient especially for high-dimensional data. In addition, we present three algorithms for solving the convex CMTL formulation. We report experimental results on benchmark datasets to demonstrate the efficiency of the proposed algorithms.


RTRMC: A Riemannian trust-region method for low-rank matrix completion

Neural Information Processing Systems

We consider large matrices of low rank. We address the problem of recovering such matrices when most of the entries are unknown. Matrix completion finds applications in recommender systems. In this setting, the rows of the matrix may correspond to items and the columns may correspond to users. The known entries are the ratings given by users to some items. The aim is to predict the unobserved ratings. This problem is commonly stated in a constrained optimization framework. We follow an approach that exploits the geometry of the low-rank constraint to recast the problem as an unconstrained optimization problem on the Grassmann manifold. We then apply first- and second-order Riemannian trust-region methods to solve it. The cost of each iteration is linear in the number of known entries. Our methods, RTRMC 1 and 2, outperform state-of-the-art algorithms on a wide range of problem instances.