Optimization
On Sample Complexity of Projection-Free Primal-Dual Methods for Learning Mixture Policies in Markov Decision Processes
Khuzani, Masoud Badiei, Vasudevan, Varun, Ren, Hongyi, Xing, Lei
We study the problem of learning a policy for an infinite-horizon, discounted-cost Markov decision process (MDP) with a large number of states. We compute the actions of a policy that is nearly as good as a policy chosen by a suitable oracle from a given mixture policy class characterized by the convex hull of a set of known base policies. To learn the coefficients of the mixture model, we recast the problem as an approximate linear programming (ALP) formulation for MDPs, where the feature vectors correspond to the occupation measures of the base policies defined on the state-action space. We then propose a projection-free stochastic primal-dual method with the Bregman divergence to solve the characterized ALP. Furthermore, we analyze the probably approximately correct (PAC) sample complexity of the proposed stochastic algorithm, namely the number of queries required to achieve a near-optimal objective value. We also propose a modification of our algorithm with polytope constraint sampling for the smoothed ALP, where the restriction to lower-bounding approximations is relaxed. In addition, we apply the proposed algorithms to a queuing problem and compare their performance with a penalty function algorithm. The numerical results illustrate that the primal-dual method achieves better efficiency and lower variance across different trials than the penalty function method.
The importance of better models in stochastic optimization
Standard stochastic optimization methods are brittle, sensitive to stepsize choices and other algorithmic parameters, and they exhibit instability outside of well-behaved families of objectives. To address these challenges, we investigate models for stochastic minimization and learning problems that exhibit better robustness to problem families and algorithmic parameters. With appropriately accurate models---which we call the aProx family---stochastic methods can be made stable, provably convergent and asymptotically optimal; even modeling that the objective is nonnegative is sufficient for this stability. We extend these results beyond convexity to weakly convex objectives, which include compositions of convex losses with smooth functions common in modern machine learning applications. We highlight the importance of robustness and accurate modeling with a careful experimental evaluation of convergence time and algorithm sensitivity.
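The remark that modeling nonnegativity alone can stabilize a stochastic method can be illustrated with a small sketch. The code below is an assumption-laden illustration, not the paper's implementation: it uses a truncated linear model of a nonnegative sample loss, which caps the step so the linearization never drops below zero, and compares it with a plain stochastic gradient step on a toy consistent least-squares problem. The data `ROWS`, the target `X_STAR`, the constant stepsize, and all function names are hypothetical choices made for the example.

```python
import math
import random

# Hypothetical toy problem: consistent linear system a.x = a.X_STAR.
X_STAR = [1.0, -2.0, 0.5]
ROWS = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [0, 1, 1], [1, 0, 1]]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def sgd_step(x, a, b, stepsize):
    """Plain stochastic gradient step on f(x) = 0.5 * (a.x - b)^2."""
    residual = dot(a, x) - b
    return [xi - stepsize * residual * ai for xi, ai in zip(x, a)]


def truncated_step(x, a, b, stepsize):
    """Step on the truncated model max(f(x) + g.(y - x), 0) of the same loss."""
    residual = dot(a, x) - b
    loss = 0.5 * residual * residual
    grad = [residual * ai for ai in a]
    grad_sq = dot(grad, grad)
    if grad_sq == 0.0:
        return list(x)
    # Never step past the point where the linear model reaches the lower bound 0.
    step = min(stepsize, loss / grad_sq)
    return [xi - step * gi for xi, gi in zip(x, grad)]


def run(step_fn, stepsize, iters, seed=0):
    """Run the method from the origin and return the distance to X_STAR."""
    rng = random.Random(seed)
    x = [0.0, 0.0, 0.0]
    for _ in range(iters):
        a = ROWS[rng.randrange(len(ROWS))]
        x = step_fn(x, a, dot(a, X_STAR), stepsize)
    return math.sqrt(sum((xi - si) ** 2 for xi, si in zip(x, X_STAR)))
```

With a deliberately oversized stepsize such as `10.0`, `run(truncated_step, 10.0, 500)` should end close to `X_STAR`, whereas `run(sgd_step, 10.0, 20)` moves away from it, which is the kind of stepsize sensitivity the abstract describes.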
A Learning Framework for Distribution-Based Game-Theoretic Solution Concepts
The past few years have seen several works establishing PAC frameworks for solving various problems in economic domains; these include optimal auction design, approximate optima of submodular functions, stable partitions and payoff divisions in cooperative games and more. In this work, we provide a unified learning-theoretic methodology for modeling these problems, and establish some useful tools for determining whether a given economic solution concept can be learned from data. Our learning theoretic framework generalizes a notion of function space dimension --- the graph dimension --- adapting it to the solution concept learning domain. We identify sufficient conditions for the PAC learnability of solution concepts, and show that results in existing works can be immediately derived using our general methodology. Finally, we apply our methods in other economic domains, yielding a novel notion of PAC competitive equilibrium and PAC Condorcet winners.
Convergence Analysis of Inexact Randomized Iterative Methods
Loizou, Nicolas, Richtárik, Peter
In this paper we present a convergence rate analysis of inexact variants of several randomized iterative methods. Among the methods studied are: stochastic gradient descent, stochastic Newton, stochastic proximal point and stochastic subspace ascent. A common feature of these methods is that in their update rule a certain sub-problem needs to be solved exactly. We relax this requirement by allowing for the sub-problem to be solved inexactly. In particular, we propose and analyze inexact randomized iterative methods for solving three closely related problems: a convex stochastic quadratic optimization problem, a best approximation problem and its dual, a concave quadratic maximization problem. We provide iteration complexity results under several assumptions on the inexactness error. Inexact variants of many popular and some more exotic methods, including randomized block Kaczmarz, randomized Gaussian Kaczmarz and randomized block coordinate descent, can be cast as special cases. Numerical experiments demonstrate the benefits of allowing inexactness.
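As a rough illustration of what an inexact randomized iterative method can look like, the sketch below runs randomized Kaczmarz on a small consistent linear system and perturbs every exact projection step with an error vector whose norm decays geometrically. This is a minimal sketch under stated assumptions, not the authors' algorithm: the uniform row sampling, the error model (`error_scale * decay**k` in a random direction), and the function name `inexact_kaczmarz` are all hypothetical.

```python
import math
import random


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def inexact_kaczmarz(A, b, iters=300, error_scale=0.1, decay=0.9, seed=0):
    """Randomized Kaczmarz where each update is corrupted by a decaying error."""
    rng = random.Random(seed)
    n = len(A[0])
    x = [0.0] * n
    for k in range(iters):
        i = rng.randrange(len(A))  # assumption: rows sampled uniformly
        a = A[i]
        # Exact sub-problem solution: project x onto the hyperplane a.x = b[i].
        coeff = (dot(a, x) - b[i]) / dot(a, a)
        x = [xi - coeff * ai for xi, ai in zip(x, a)]
        # Inexactness: add an error of norm error_scale * decay**k.
        noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
        noise_norm = math.sqrt(dot(noise, noise)) or 1.0
        scale = error_scale * decay ** k / noise_norm
        x = [xi + scale * ni for xi, ni in zip(x, noise)]
    return x


# Consistent system whose solution is [1, 2].
A = [[2.0, 1.0], [1.0, 3.0], [1.0, -1.0]]
b = [4.0, 7.0, -1.0]
solution = inexact_kaczmarz(A, b)
```

Because the injected error shrinks over time, the iterates still approach the solution of the system; an error that does not decay would typically leave the iterates in a neighborhood of the solution instead.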
Low-rank approximations of hyperbolic embeddings
Jawanpuria, Pratik, Meghwanshi, Mayank, Mishra, Bamdev
The hyperbolic manifold is a smooth manifold of constant negative curvature. While the hyperbolic manifold is well-studied in the literature, it has lately gained interest in the machine learning and natural language processing communities due to its usefulness in modeling continuous hierarchies. Tasks with hierarchical structures are ubiquitous in those fields, and there is a general interest in learning hyperbolic representations or embeddings of such tasks. Additionally, these embeddings of related tasks may also share a low-rank subspace. In this work, we propose to learn hyperbolic embeddings such that they also lie in a low-dimensional subspace. In particular, we consider the problem of learning a low-rank factorization of hyperbolic embeddings. We cast these problems as manifold optimization problems and propose computationally efficient algorithms. Empirical results illustrate the efficacy of the proposed approach.
Intelligent Solution System towards Parts Logistics Optimization
Huang, Yaoting, Chen, Boyu, Lu, Wenlian, Jin, Zhong-Xiao, Zheng, Ren
Due to the complexity of the problem, intelligent algorithms show great power in solving the parts logistics optimization problem, which is related to the vehicle routing problem (VRP). However, most existing research on the VRP is not comprehensive and fails to solve real-world parts logistics problems. In this work, targeting the SAIC logistics problem, we propose a systematic solution to this 2-Dimensional Loading Capacitated Multi-Depot Heterogeneous VRP with Time Windows by integrating diverse types of intelligent algorithms, including a heuristic algorithm that initializes feasible logistics planning schemes by imitating manual planning; the core Tabu Search algorithm for global optimization, accelerated by a novel bundle technique; associated heuristic algorithms for routing, packing and queuing; and a heuristic post-optimization process that further improves the solution. Based on these algorithms, SAIC Motor has successfully established an intelligent management system that provides a systematic solution for parts logistics planning and is superior to manual planning in performance, customizability and expandability.
Gaussian Process Optimization with Adaptive Sketching: Scalable and No Regret
Calandriello, Daniele, Carratino, Luigi, Lazaric, Alessandro, Valko, Michal, Rosasco, Lorenzo
Gaussian processes (GP) are a popular Bayesian approach for the optimization of black-box functions. Despite their effectiveness in simple problems, GP-based algorithms hardly scale to complex high-dimensional functions, as their per-iteration time and space cost is at least quadratic in the number of dimensions $d$ and iterations $t$. Given a set of $A$ alternatives to choose from, the overall runtime $O(t^3A)$ quickly becomes prohibitive. In this paper, we introduce BKB (budgeted kernelized bandit), a novel approximate GP algorithm for optimization under bandit feedback that achieves near-optimal regret (and hence near-optimal convergence rate) with near-constant per-iteration complexity and no assumption on the input space or covariance of the GP. Combining a kernelized linear bandit algorithm (GP-UCB) with a randomized matrix sketching technique (i.e., leverage score sampling), we prove that selecting inducing points based on their posterior variance gives an accurate low-rank approximation of the GP, preserving variance estimates and confidence intervals. As a consequence, BKB does not suffer from variance starvation, an important problem faced by many previous sparse GP approximations. Moreover, we show that our procedure selects at most $\tilde{O}(d_{eff})$ points, where $d_{eff}$ is the effective dimension of the explored space, which is typically much smaller than both $d$ and $t$. This greatly reduces the dimensionality of the problem, thus leading to a $O(TAd_{eff}^2)$ runtime and $O(A d_{eff})$ space complexity.
Iterated two-phase local search for the Set-Union Knapsack Problem
The Set-union Knapsack Problem (SUKP) is a generalization of the popular 0-1 knapsack problem. Given a set of weighted elements and a set of items with profits where each item is composed of a subset of elements, the SUKP involves packing a subset of items in a capacity-constrained knapsack such that the total profit of the selected items is maximized while their weights do not exceed the knapsack capacity. In this work, we present an effective iterated two-phase local search algorithm for this NP-hard combinatorial optimization problem. The proposed algorithm iterates through two search phases: a local optima exploration phase that alternates between a variable neighborhood descent search and a tabu search to explore local optimal solutions, and a local optima escaping phase to drive the search to unexplored regions. We show the competitiveness of the algorithm compared to the state-of-the-art methods in the literature. Specifically, the algorithm discovers 18 improved best results (new lower bounds) for the 30 benchmark instances and matches the best-known results for the 12 remaining instances. We also report the first computational results with the general CPLEX solver, including 6 proven optimal solutions. Finally, we investigate the effectiveness of the key ingredients of the algorithm on its performance.
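To make the problem definition concrete, the following sketch evaluates a selection of items for a tiny SUKP instance and finds the optimum by exhaustive enumeration. It illustrates the objective and the union-weight constraint only; it is not the iterated two-phase local search of the paper, and the instance data and function names are made up for the example.

```python
from itertools import combinations


def evaluate(selection, profits, item_elements, element_weights):
    """Return (total profit, weight of the union of elements) for a selection."""
    profit = sum(profits[i] for i in selection)
    covered = set().union(*(item_elements[i] for i in selection))
    weight = sum(element_weights[e] for e in covered)
    return profit, weight


def brute_force_sukp(profits, item_elements, element_weights, capacity):
    """Enumerate all item subsets; only viable for very small instances."""
    best_profit, best_selection = 0, ()
    items = range(len(profits))
    for size in range(len(profits) + 1):
        for selection in combinations(items, size):
            profit, weight = evaluate(
                selection, profits, item_elements, element_weights
            )
            if weight <= capacity and profit > best_profit:
                best_profit, best_selection = profit, selection
    return best_profit, best_selection


# Hypothetical instance: 4 items over 4 weighted elements, capacity 7.
element_weights = {0: 2, 1: 3, 2: 4, 3: 1}
profits = [5, 4, 3, 6]
item_elements = [{0, 1}, {1, 2}, {3}, {0, 2, 3}]

best_profit, best_selection = brute_force_sukp(
    profits, item_elements, element_weights, capacity=7
)
# Items 2 and 3 share element 3, so their union weighs 2 + 4 + 1 = 7.
```

The key difference from the 0-1 knapsack problem is visible in `evaluate`: an element shared by several selected items is counted once, so the weight of a selection is not the sum of per-item weights.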
Adaptive Sample-Efficient Blackbox Optimization via ES-active Subspaces
Choromanski, Krzysztof, Pacchiano, Aldo, Parker-Holder, Jack, Tang, Yunhao
We present a new algorithm, ASEBO, for optimizing high-dimensional blackbox functions. ASEBO adapts to the geometry of the function and learns optimal sets of sensing directions, which are used to probe it, on-the-fly. It addresses the exploration-exploitation trade-off of blackbox optimization, where each single function query is expensive, by continuously learning the bias of the lower-dimensional model used to approximate gradients of smoothings of the function with compressed sensing and contextual bandits methods. To obtain this model, it uses techniques from the emerging theory of active subspaces in the novel ES blackbox optimization context. As a result, ASEBO learns the dynamically changing intrinsic dimensionality of the gradient space and adapts to the hardness of different stages of the optimization without external supervision. Consequently, it leads to more sample-efficient blackbox optimization than state-of-the-art algorithms. We provide rigorous theoretical justification of the effectiveness of our method. We also empirically evaluate it on a set of reinforcement learning policy optimization tasks as well as functions from the recently open-sourced Nevergrad library, demonstrating that it consistently learns optimal inputs with fewer queries to a blackbox function than other methods.
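For readers unfamiliar with ES-style gradient approximation, the sketch below shows a standard antithetic Monte Carlo estimator of the gradient of a Gaussian smoothing of a blackbox function, using isotropic sensing directions. This is only the generic baseline that adaptive methods aim to improve on; it does not implement ASEBO's active-subspace or bandit components, and the function name, default parameters, and test function are assumptions made for the example.

```python
import random


def es_gradient(f, x, sigma=0.1, num_samples=4000, seed=0):
    """Antithetic estimate of the gradient of the Gaussian smoothing of f at x."""
    rng = random.Random(seed)
    n = len(x)
    grad = [0.0] * n
    for _ in range(num_samples):
        # Isotropic sensing direction; adaptive methods would bias this choice.
        eps = [rng.gauss(0.0, 1.0) for _ in range(n)]
        plus = f([xi + sigma * ei for xi, ei in zip(x, eps)])
        minus = f([xi - sigma * ei for xi, ei in zip(x, eps)])
        coeff = (plus - minus) / (2.0 * sigma)
        for j in range(n):
            grad[j] += coeff * eps[j]
    return [g / num_samples for g in grad]


def sphere(x):
    """Toy blackbox function with known gradient 2 * x."""
    return sum(xi * xi for xi in x)


estimate = es_gradient(sphere, [1.0, -2.0])
```

Each sample costs two function queries, which is why reducing the number of sensing directions matters when a single query is expensive.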
Financial Applications of Gaussian Processes and Bayesian Optimization
Gonzalvez, Joan, Lezmi, Edmond, Roncalli, Thierry, Xu, Jiali
In the last five years, the financial industry has been impacted by the emergence of digitalization and machine learning. In this article, we explore two methods that have undergone rapid development in recent years: Gaussian processes and Bayesian optimization. Gaussian processes can be seen as a generalization of Gaussian random vectors and are associated with the development of kernel methods. Bayesian optimization is an approach for performing derivative-free global optimization in a small dimension, and uses Gaussian processes to locate the global maximum of a black-box function. The first part of the article reviews these two tools and shows how they are connected. In particular, we focus on the Gaussian process regression, which is the core of Bayesian machine learning, and the issue of hyperparameter selection. The second part is dedicated to two financial applications. We first consider the modeling of the term structure of interest rates. More precisely, we test the fitting method and compare the GP prediction and the random walk model. The second application is the construction of trend-following strategies, in particular the online estimation of trend and covariance windows.