 Optimization


A Fine-Grained Variant of the Hierarchy of Lasserre

arXiv.org Artificial Intelligence

There has been much recent interest in hierarchies of progressively stronger convexifications of polynomial optimisation problems (POP). These often converge to the global optimum of the POP, asymptotically, but prove challenging to solve beyond the first level in the hierarchy for modest instances. We present a finer-grained variant of the Lasserre hierarchy, together with first-order methods for solving the convexifications, which allow for efficient warm-starting with solutions from lower levels in the hierarchy.


Optimal Solution Predictions for Mixed Integer Programs

arXiv.org Artificial Intelligence

Mixed Integer Programming (MIP) is one of the most widely used modeling techniques for combinatorial optimization problems. In many applications, a similar MIP model is solved on a regular basis, maintaining remarkable similarities in model structures and solution appearances but differing in formulation coefficients. This offers the opportunity for machine learning methods to explore the correlations between model structures and the resulting solution values. To exploit this opportunity, we propose to represent an MIP instance using a tripartite graph, based on which a Graph Convolutional Network (GCN) is constructed to predict solution values for binary variables. The predicted solutions are used to generate a local branching cut for the model, which accelerates the solution process for the MIP. Computational evaluations on 8 distinct types of MIP problems show that the proposed framework significantly improves the performance of a state-of-the-art open source MIP solver in terms of running time and solution quality.
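To make the idea of a local branching cut concrete, here is a minimal sketch. It assumes the classical form of the cut, which keeps the solver within Hamming distance k of a predicted binary assignment; the abstract does not state the exact form used in the paper. The function names, the radius `k`, and the toy prediction are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def local_branching_cut(predicted: List[int], k: int) -> Tuple[List[int], int]:
    """Build a linear cut that keeps x within Hamming distance k of `predicted`.

    The distance  sum_{i: p_i=1} (1 - x_i) + sum_{i: p_i=0} x_i <= k
    is rearranged to  sum_i coeffs[i] * x_i <= rhs.
    """
    if any(v not in (0, 1) for v in predicted):
        raise ValueError("predicted values must be binary")
    # Variables predicted as 1 get coefficient -1, those predicted as 0 get +1.
    coeffs = [-1 if v == 1 else 1 for v in predicted]
    # Moving the constant |{i: p_i = 1}| to the right-hand side.
    rhs = k - sum(predicted)
    return coeffs, rhs


def satisfies(coeffs: List[int], rhs: int, x: List[int]) -> bool:
    """Check whether a binary point x satisfies the cut."""
    return sum(c * xi for c, xi in zip(coeffs, x)) <= rhs


# Hypothetical prediction for three binary variables, with radius k = 1.
coeffs, rhs = local_branching_cut([1, 0, 1], k=1)
```

In this toy case the prediction itself and any point that differs from it in one coordinate satisfy the cut, while points that differ in two or more coordinates are excluded.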


Learning with fuzzy hypergraphs: a topical approach to query-oriented text summarization

arXiv.org Artificial Intelligence

Existing graph-based methods for extractive document summarization represent sentences of a corpus as the nodes of a graph or a hypergraph in which edges depict relationships of lexical similarity between sentences. Such approaches fail to capture semantic similarities between sentences when they express similar information but have few words in common and are thus lexically dissimilar. To overcome this issue, we propose to extract semantic similarities based on topical representations of sentences. Inspired by the Hierarchical Dirichlet Process, we propose a probabilistic topic model in order to infer topic distributions of sentences. As each topic defines a semantic connection among a group of sentences with a certain degree of membership for each sentence, we propose a fuzzy hypergraph model in which nodes are sentences and fuzzy hyperedges are topics. To produce an informative summary, we extract a set of sentences from the corpus by simultaneously maximizing their relevance to a user-defined query, their centrality in the fuzzy hypergraph and their coverage of topics present in the corpus. We formulate a polynomial time algorithm building on the theory of submodular functions to solve the associated optimization problem. A thorough comparative analysis with other graph-based summarization systems is included in the paper. Our results show the superiority of our method in terms of content coverage of the summaries.
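The following sketch illustrates the kind of greedy selection that submodular theory supports. It is not the paper's algorithm: it covers only the topic-coverage part of the objective, uses crisp rather than fuzzy topic membership, and assumes a simple budget on the number of sentences. The sentence identifiers, topic labels, and function name are hypothetical. For monotone submodular objectives under such a cardinality budget, this greedy rule is known to achieve at least a (1 - 1/e) fraction of the optimum.

```python
from typing import Dict, List, Set


def greedy_cover(topics_of: Dict[str, Set[str]], budget: int) -> List[str]:
    """Pick up to `budget` sentences, each time taking the largest marginal gain."""
    chosen: List[str] = []
    covered: Set[str] = set()
    for _ in range(budget):
        best, best_gain = None, 0
        # Iterate in sorted order so that ties are broken deterministically.
        for sent in sorted(topics_of):
            if sent in chosen:
                continue
            # Marginal gain: topics this sentence adds beyond those already covered.
            gain = len(topics_of[sent] - covered)
            if gain > best_gain:
                best, best_gain = sent, gain
        if best is None:  # no remaining sentence adds a new topic
            break
        chosen.append(best)
        covered |= topics_of[best]
    return chosen


# Hypothetical corpus: three sentences tagged with the topics they touch.
example = {"s1": {"a", "b"}, "s2": {"b", "c", "d"}, "s3": {"a"}}
summary = greedy_cover(example, budget=2)
```

Here `s2` is picked first because it covers three topics, and `s1` is picked next because it adds the remaining topic `a` and wins the tie against `s3` by sort order.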


Bayesian Optimization with Directionally Constrained Search

arXiv.org Machine Learning

Bayesian optimization offers a flexible framework for optimizing an objective function that is expensive to evaluate. A Bayesian optimizer iteratively queries the function at carefully selected points. It then recommends where the optimum is likely to be located, based on its accumulated knowledge. This procedure usually demands a long execution time. In practice, however, an optimizer is often given a limited computational budget or a cap on the number of evaluations, due to resource scarcity. This constraint requires an optimizer to be aware of its remaining budget and to spend it wisely, in order to return as good a point as possible. In this paper, we propose a Bayesian optimization approach for this evaluation-limited scenario. Our approach is based on constraining search directions so as to dedicate the model capability to the most promising area. It can be viewed as a combination of local and global search policies, which aims at reducing inefficient exploration in the local search areas, thus making the search policy more efficient. Experimental studies are conducted on both synthetic and real-world applications. The results demonstrate the superior performance of our newly proposed approach in searching for the optimum within a prescribed evaluation budget.


A Unifying Framework for Variance Reduction Algorithms for Finding Zeroes of Monotone Operators

arXiv.org Machine Learning

A wide range of optimization problems can be recast as monotone inclusion problems. We propose a unifying framework for solving the monotone inclusion problem with randomized Forward-Backward algorithms. Our framework covers many existing deterministic and stochastic algorithms. Under various conditions, we can establish both sublinear and linear convergence rates in expectation for the algorithms covered by this framework. In addition, we consider algorithm design as well as asynchronous randomized Forward algorithms. Numerical experiments demonstrate the worth of the new algorithms that emerge from our framework.
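As a point of reference, the deterministic Forward-Backward iteration that such frameworks build on can be sketched on a one-dimensional problem. This is an illustration under stated assumptions, not one of the randomized algorithms from the paper: the problem (minimize 0.5 * (x - b)^2 + lam * |x|), the step size, and the function names are all chosen for this example. The forward step applies the gradient of the smooth term, and the backward step applies the resolvent of the nonsmooth term, which here is soft-thresholding.

```python
def soft_threshold(v: float, t: float) -> float:
    """Resolvent (proximal map) of t * |x|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def forward_backward(b: float, lam: float, step: float = 0.5, iters: int = 200) -> float:
    """Iterate x <- J(x - step * A(x)) for A(x) = x - b and B = lam * d|x|."""
    x = 0.0
    for _ in range(iters):
        forward = x - step * (x - b)                # forward (explicit) step on A
        x = soft_threshold(forward, step * lam)     # backward (implicit) step on B
    return x


# With b = 3 and lam = 1 the minimizer is soft_threshold(3, 1) = 2.
x_star = forward_backward(3.0, 1.0)
```

The fixed point of the iteration is a zero of the sum of the two operators, which is exactly the monotone inclusion being solved.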


Meta-Model Framework for Surrogate-Based Parameter Estimation in Dynamical Systems

arXiv.org Machine Learning

The central task in modeling complex dynamical systems is parameter estimation. This task involves numerous evaluations of a computationally expensive objective function. Surrogate-based optimization introduces a computationally efficient predictive model that approximates the value of the objective function. The standard approach involves learning a surrogate from training examples that correspond to past evaluations of the objective function. Current surrogate-based optimization methods use static, predefined substitution strategies that decide when to use the surrogate and when to use the true objective. We introduce a meta-model framework where the substitution strategy is dynamically adapted to the solution space of the given optimization problem. The meta model encapsulates the objective function, the surrogate model and the model of the substitution strategy, as well as components for learning them. The framework can be seamlessly coupled with an arbitrary optimization algorithm without any modification: it replaces the objective function and autonomously decides how to evaluate a given candidate solution. We test the utility of the framework on three tasks of estimating parameters of real-world models of dynamical systems. The results show that the meta model significantly improves the efficiency of optimization, reducing the total number of evaluations of the objective function by up to 77% on average.


Hybrid Planning for Dynamic Multimodal Stochastic Shortest Paths

arXiv.org Artificial Intelligence

Sequential decision problems in applications such as manipulation in warehouses, multi-step meal preparation, and routing in autonomous vehicle networks often involve reasoning about uncertainty, planning over discrete modes as well as continuous states, and reacting to dynamic updates. To formalize such problems generally, we introduce a class of Markov Decision Processes (MDPs) called Dynamic Multimodal Stochastic Shortest Paths (DMSSPs). Much of the work in these domains solves deterministic variants, which can yield poor results when the uncertainty has downstream effects. We develop a Hybrid Stochastic Planning (HSP) algorithm, which uses domain-agnostic abstractions to efficiently unify heuristic search for planning over discrete modes, approximate dynamic programming for stochastic planning over continuous states, and hierarchical interleaved planning and execution.


10 Compelling Machine Learning Dissertations from Ph.D. Students

#artificialintelligence

This dissertation proposes efficient algorithms and provides theoretical analysis through the angle of spectral methods for some important non-convex optimization problems in machine learning. Specifically, the focus is on two types of non-convex optimization problems: learning the parameters of latent variable models and learning in deep neural networks. Learning latent variable models is traditionally framed as a non-convex optimization problem through Maximum Likelihood Estimation (MLE). For some specific models, such as the multi-view model, it is possible to bypass the non-convexity by leveraging the special model structure and converting the problem into a spectral decomposition through a Method of Moments (MM) estimator. In this research, a novel algorithm is proposed that can flexibly learn a multi-view model in a non-parametric fashion.


QoE-Aware Resource Allocation for Crowdsourced Live Streaming: A Machine Learning Approach

arXiv.org Machine Learning

Driven by the tremendous technological advancement of personal devices and the prevalence of wireless mobile network access, the world has witnessed an explosion in crowdsourced live streaming. Ensuring a better quality of experience (QoE) for viewers is the key to maximizing the audience size and increasing streaming providers' profits. This can be achieved by using a geo-distributed cloud infrastructure to allocate the multimedia resources as close as possible to viewers, in order to minimize the access delay and video stalls. Moreover, allocating exactly the needed resources beforehand avoids over-provisioning, which may lead to significant costs for the service providers. Under-provisioning, on the other hand, might cause significant delays for the viewers. In this paper, we introduce a prediction-driven resource allocation framework to maximize the QoE of viewers and minimize the resource allocation cost. First, by exploiting the viewers' locations available in our unique dataset, we implement a machine learning model to predict the number of viewers near each geo-distributed cloud site. Second, based on the predicted results, which were shown to be close to the actual values, we formulate an optimization problem to proactively allocate resources in the viewers' proximity. Additionally, we present a trade-off between the video access delay and the cost of resource allocation.


Wasserstein Reinforcement Learning

arXiv.org Machine Learning

We propose behavior-driven optimization via Wasserstein distances (WDs) to improve several classes of state-of-the-art reinforcement learning (RL) algorithms. We show that WD regularizers acting on appropriate policy embeddings efficiently incorporate behavioral characteristics into policy optimization. We demonstrate that they improve Evolution Strategy methods by encouraging more efficient exploration, can be applied in imitation learning, and can speed up training of Trust Region Policy Optimization methods. Since the exact computation of WDs is expensive, we develop approximate algorithms based on a combination of different methods (the dual formulation of the optimal transport problem, alternating optimization, and random feature maps) to effectively replace exact WD computations in the RL tasks considered. We provide theoretical analysis of our algorithms and exhaustive empirical evaluation in a variety of RL settings.