Optimization
Bayesian Optimization for Multi-objective Optimization and Multi-point Search
Bayesian optimization is an effective method for efficiently optimizing unknown objective functions with high evaluation costs. Traditional Bayesian optimization algorithms select one point per iteration for a single objective function, whereas in recent years, Bayesian optimization methods for multi-objective optimization or for multi-point search per iteration have been proposed. However, no Bayesian optimization method is currently known that handles both at the same time in a non-heuristic way. We propose a Bayesian optimization algorithm that can deal with multi-objective optimization and multi-point search at the same time. First, we define an acquisition function that considers both multi-objective and multi-point search problems. It is difficult to analytically maximize the acquisition function as the computational cost is prohibitive even when approximate calculations such as sampling approximation are performed; therefore, we propose an accurate and computationally efficient method for estimating the gradient of the acquisition function, and develop an algorithm for Bayesian optimization with multi-objective and multi-point search. Numerical experiments show that the performance of the proposed method is comparable or superior to that of heuristic methods.
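For orientation, the traditional setting mentioned above (one point per iteration, a single objective) is commonly driven by the expected improvement acquisition function. The sketch below is not the acquisition function proposed in the paper; it assumes a Gaussian posterior with mean `mu` and standard deviation `sigma` at a candidate point, a maximization problem, and an incumbent value `best`. The function name is hypothetical.

```python
import math

def expected_improvement(mu, sigma, best):
    """Closed-form expected improvement for maximization.

    Assumes the surrogate posterior at the candidate is N(mu, sigma^2)
    and `best` is the largest objective value observed so far.
    """
    if sigma <= 0.0:
        # No posterior uncertainty: the improvement is deterministic.
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mu - best) * cdf + sigma * pdf
```

A single-point method would evaluate this at many candidates and query the maximizer; the multi-point, multi-objective case described in the abstract needs a joint acquisition function over several points and objectives, which this sketch does not attempt.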
Foundations of Machine Learning: Part 3 - DZone AI
This post is the seventh one of our series on the history and foundations of econometric and machine learning models. The first four were on econometric techniques. Part 6 is online here. As we have seen before, modeling here is based on solving an optimization problem, and solving the problem described by equation (6) is all the more complex because the functional space M is large. The idea of boosting, as introduced by Schapire and Freund (2012), is to learn, slowly, from the errors of the model, in an iterative way.
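The idea of learning slowly from the errors can be sketched as follows. This is a minimal illustration, not the formulation of the post's equation (6): it assumes a squared-error loss, a one-dimensional input, decision stumps as weak learners, and a small shrinkage factor `learning_rate`. All function names are hypothetical.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression stump to the residuals (least squares)."""
    best = None
    for t in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        if not left or not right:
            continue
        lv = sum(left) / len(left)
        rv = sum(right) / len(right)
        sse = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lv, rv)
    if best is None:
        # All inputs identical: fall back to a constant prediction.
        mean = sum(residuals) / len(residuals)
        return xs[0], mean, mean
    return best[1], best[2], best[3]

def boost(xs, ys, n_rounds=50, learning_rate=0.1):
    """Each round fits a stump to the current errors and adds a shrunken copy."""
    prediction = [0.0] * len(xs)
    model = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, prediction)]
        t, lv, rv = fit_stump(xs, residuals)
        # Store the already-shrunken leaf values.
        model.append((t, learning_rate * lv, learning_rate * rv))
        prediction = [p + (model[-1][1] if x <= t else model[-1][2])
                      for x, p in zip(xs, prediction)]
    return model

def predict(model, x):
    return sum(lv if x <= t else rv for t, lv, rv in model)

def mse(model, xs, ys):
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(xs)
```

With a small `learning_rate`, each round corrects only a fraction of the remaining error, so the training error shrinks gradually as rounds are added.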
Model-Free Reinforcement Learning for Financial Portfolios: A Brief Survey
Financial portfolio management is one of the most frequently encountered problems in the investment industry. Nevertheless, it is not widely recognized that both Kelly Criterion and Risk Parity collapse into Mean Variance under some conditions, which implies that a universal solution to the portfolio optimization problem could potentially exist. In fact, the process of sequential computation of optimal component weights that maximize the portfolio's expected return subject to a certain risk budget can be reformulated as a discrete-time Markov Decision Process (MDP) and hence as a stochastic optimal control problem, where the system being controlled is a portfolio consisting of multiple investment components, and the control is its component weights. Consequently, the problem could be solved using model-free Reinforcement Learning (RL) without knowing specific component dynamics. By examining existing methods of both value-based and policy-based model-free RL for the portfolio optimization problem, we identify some of the key unresolved questions and difficulties that today's portfolio managers face in applying model-free RL to their investment portfolios.
Data-efficient Learning of Morphology and Controller for a Microrobot
Liao, Thomas, Wang, Grant, Yang, Brian, Lee, Rene, Pister, Kristofer, Levine, Sergey, Calandra, Roberto
Robot design is often a slow and difficult process requiring the iterative construction and testing of prototypes, with the goal of sequentially optimizing the design. For most robots, this process is further complicated by the need, when validating the capabilities of the hardware to solve the desired task, to already have an appropriate controller, which is in turn designed and tuned for the specific hardware. In this paper, we propose a novel approach, HPC-BBO, to efficiently and automatically design hardware configurations, and evaluate them by also automatically tuning the corresponding controller. HPC-BBO is based on a hierarchical Bayesian optimization process which iteratively optimizes morphology configurations (based on the performance of the previous designs during the controller learning process) and subsequently learns the corresponding controllers (exploiting the knowledge collected from optimizing for previous morphologies). Moreover, HPC-BBO can select a "batch" of multiple morphology designs at once, thus parallelizing hardware validation and reducing the number of time-consuming production cycles. We validate HPC-BBO on the design of the morphology and controller for a simulated 6-legged microrobot. Experimental results show that HPC-BBO outperforms multiple competitive baselines, and yields a $360\%$ reduction in production cycles over standard Bayesian optimization, thus reducing the hypothetical manufacturing time of our microrobot from 21 to 4 months.
Information asymmetry in KL-regularized RL
Galashov, Alexandre, Jayakumar, Siddhant M., Hasenclever, Leonard, Tirumala, Dhruva, Schwarz, Jonathan, Desjardins, Guillaume, Czarnecki, Wojciech M., Teh, Yee Whye, Pascanu, Razvan, Heess, Nicolas
Many real world tasks exhibit rich structure that is repeated across different parts of the state space or in time. In this work we study the possibility of leveraging such repeated structure to speed up and regularize learning. We start from the KL regularized expected reward objective which introduces an additional component, a default policy. Instead of relying on a fixed default policy, we learn it from data. But crucially, we restrict the amount of information the default policy receives, forcing it to learn reusable behaviours that help the policy learn faster. We formalize this strategy and discuss connections to information bottleneck approaches and to the variational EM algorithm. We present empirical results in both discrete and continuous action domains and demonstrate that, for certain tasks, learning a default policy alongside the policy can significantly speed up and improve learning.
Efficient Discrete Supervised Hashing for Large-scale Cross-modal Retrieval
Yao, Tao, Kong, Xiangwei, Yan, Lianshan, Tang, Wenjing, Tian, Qi
Supervised cross-modal hashing has gained increasing research interest for large-scale retrieval tasks owing to its satisfactory performance and efficiency. However, existing methods still have some challenging issues to be further studied: 1) most of them fail to preserve the semantic correlations in hash codes well because of the large heterogeneous gap; 2) most of them relax the discrete constraint on hash codes, leading to large quantization error and consequently low performance; 3) most of them suffer from relatively high memory cost and computational complexity during the training procedure, which makes them unscalable. In this paper, to address the above issues, we propose a supervised cross-modal hashing method based on matrix factorization, dubbed Efficient Discrete Supervised Hashing (EDSH). Specifically, collective matrix factorization on heterogeneous features and semantic embedding with class labels are seamlessly integrated to learn hash codes. Therefore, both the feature-based similarities and the semantic correlations can be preserved in hash codes, which makes the learned hash codes more discriminative. Then an efficient discrete optimization algorithm is proposed to handle the scalability issue. Instead of learning hash codes bit by bit, the hash code matrix can be obtained directly, which is more efficient. Extensive experimental results on three public real-world datasets demonstrate that EDSH outperforms some existing cross-modal hashing methods in both accuracy and scalability.
A tutorial on recursive models for analyzing and predicting path choice behavior
Zimmermann, Maëlle, Frejinger, Emma
The problem at the heart of this tutorial consists in modeling the path choice behavior of network users. This problem has been extensively studied in transportation science and econometrics, where it is known as the route choice problem. In this literature, individuals' path choices are typically predicted from discrete choice models. The aim of this tutorial is to present this problem from the novel and more general perspective of inverse optimization, in order to describe the modeling approaches proposed in related research areas and thereby motivate the use of so-called recursive models. The latter have the advantage of predicting path choices without generating choice sets. In this paper, we contextualize discrete choice models as a probabilistic approach to an inverse shortest path problem with noisy data, highlighting that recursive discrete choice models in particular originate from viewing the inner shortest path problem as a parametric Markov Decision Process. We also illustrate through simple numerical examples that recursive models overcome issues associated with the path-based discrete choice models commonly found in the transportation literature.
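To make the recursive idea concrete, the sketch below computes a log-sum-exp value function on a toy acyclic network and derives link choice probabilities from it, so that no set of candidate paths is ever enumerated. It is an illustration under stated assumptions, not the tutorial's own example: the network `EDGES`, its costs, the destination `DEST`, and the choice of link utility as negative cost with unit scale are all hypothetical.

```python
import math
from functools import lru_cache

# Hypothetical toy network: node -> {successor: link cost}.
EDGES = {
    "A": {"B": 1.0, "C": 2.0},
    "B": {"D": 1.0},
    "C": {"D": 1.0},
    "D": {},
}
DEST = "D"

@lru_cache(maxsize=None)
def value(node):
    """Expected maximum utility to the destination (log-sum-exp recursion)."""
    if node == DEST:
        return 0.0
    return math.log(sum(math.exp(-cost + value(nxt))
                        for nxt, cost in EDGES[node].items()))

def link_probabilities(node):
    """Probability of choosing each outgoing link at `node`."""
    return {nxt: math.exp(-cost + value(nxt) - value(node))
            for nxt, cost in EDGES[node].items()}
```

Here `link_probabilities("A")` assigns the larger share to the link towards `B`, since the route through `B` has the lower total cost; the probability of a full path is the product of its link probabilities.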
Automated Machine Learning via ADMM
Liu, Sijia, Ram, Parikshit, Bouneffouf, Djallel, Bramble, Gregory, Conn, Andrew R, Samulowitz, Horst, Gray, Alexander
We study the automated machine learning (AutoML) problem of jointly selecting appropriate algorithms from an algorithm portfolio as well as optimizing their hyper-parameters for certain learning tasks. The main challenges include a) the coupling between algorithm selection and hyper-parameter optimization (HPO), and b) the black-box optimization nature of the problem where the optimizer cannot access the gradients of the loss function but may query function values. To circumvent these difficulties, we propose a new AutoML framework by leveraging the alternating direction method of multipliers (ADMM) scheme. Due to the splitting properties of ADMM, algorithm selection and HPO can be decomposed through the augmented Lagrangian function. As a result, HPO with mixed continuous and integer constraints is efficiently handled through a query-efficient Bayesian optimization approach and a Euclidean projection operator that yields a closed-form solution. Algorithm selection in ADMM is naturally interpreted as a combinatorial bandit problem. The effectiveness of our proposed methodology is compared to state-of-the-art AutoML schemes such as TPOT and Auto-sklearn on numerous benchmark data sets.
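As a small illustration of why a Euclidean projection onto integer constraints can have a closed form, consider the simplest case, which is an assumption here and not necessarily the constraint set used in the paper: each coordinate must be an integer between integer bounds `lo` and `hi`. The nearest feasible point is then obtained coordinate-wise by rounding and clipping. The function name is hypothetical.

```python
import math

def project_to_integer_box(x, lo, hi):
    """Euclidean projection of x onto the integers in [lo, hi], per coordinate.

    Assumes lo and hi are integers with lo <= hi. Ties (values exactly
    halfway between two integers) are rounded up; either neighbour is an
    equally valid projection.
    """
    return [min(max(math.floor(v + 0.5), lo), hi) for v in x]
```

For example, `project_to_integer_box([2.4, -3.2, 7.9, 0.5], 0, 5)` returns `[2, 0, 5, 1]`.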
Using Collective Behavior of Coupled Oscillators for Solving DCOP
Leite, Allan R., Enembreck, Fabricio
The distributed constraint optimization problem (DCOP) has emerged as one of the most promising coordination techniques in multiagent systems. However, because DCOP is known to be NP-hard, the existing DCOP techniques are often unsuitable for large-scale applications, which require distributed and scalable algorithms to deal with severely limited computing and communication. In this paper, we present a novel approach to provide approximate solutions for large-scale, complex DCOPs. This approach introduces concepts of synchronization of coupled oscillators for speeding up the convergence process towards high-quality solutions. We propose a new anytime local search DCOP algorithm, called Coupled Oscillator OPTimization (COOPT), which amounts to iteratively solving a DCOP by agents exchanging local information that brings them to a consensus. We empirically evaluate COOPT on constraint networks involving hundreds of variables with different topologies, domains, and densities. Our experimental results demonstrate that COOPT outperforms other incomplete state-of-the-art DCOP algorithms, especially in terms of the agents' communication cost and solution quality.
Categorical Feature Compression via Submodular Optimization
Bateni, MohammadHossein, Chen, Lin, Esfandiari, Hossein, Fu, Thomas, Mirrokni, Vahab S., Rostamizadeh, Afshin
In the era of big data, learning from categorical features with very large vocabularies (e.g., 28 million for the Criteo click prediction dataset) has become a practical challenge for machine learning researchers and practitioners. We design a highly-scalable vocabulary compression algorithm that seeks to maximize the mutual information between the compressed categorical feature and the target binary labels and we furthermore show that its solution is guaranteed to be within a $1-1/e \approx 63\%$ factor of the global optimal solution. To achieve this, we introduce a novel re-parametrization of the mutual information objective, which we prove is submodular, and design a data structure to query the submodular function in amortized $O(\log n )$ time (where $n$ is the input vocabulary size). Our complete algorithm is shown to operate in $O(n \log n )$ time. Additionally, we design a distributed implementation in which the query data structure is decomposed across $O(k)$ machines such that each machine only requires $O(\frac n k)$ space, while still preserving the approximation guarantee and using only logarithmic rounds of computation. We also provide analysis of simple alternative heuristic compression methods to demonstrate they cannot achieve any approximation guarantee. Using the large-scale Criteo learning task, we demonstrate better performance in retaining mutual information and also verify competitive learning performance compared to other baseline methods.
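The $1-1/e$ factor quoted above is also the classic guarantee of the greedy algorithm for maximizing a monotone submodular function under a cardinality constraint. The sketch below shows that generic greedy procedure on a toy set-coverage objective; it is not the paper's compression algorithm, and the objective, the function names, and the example sets are all hypothetical.

```python
def coverage(sets, chosen):
    """Monotone submodular toy objective: number of elements covered."""
    covered = set()
    for i in chosen:
        covered |= sets[i]
    return len(covered)

def greedy_max(sets, k):
    """Pick up to k sets, each time taking the largest marginal gain."""
    chosen = []
    for _ in range(k):
        current = coverage(sets, chosen)
        best_gain, best_index = 0, None
        for i in range(len(sets)):
            if i in chosen:
                continue
            gain = coverage(sets, chosen + [i]) - current
            if gain > best_gain:
                best_gain, best_index = gain, i
        if best_index is None:
            # No remaining set adds anything new.
            break
        chosen.append(best_index)
    return chosen

# Hypothetical example data.
EXAMPLE_SETS = [{1, 2, 3, 4}, {1, 2, 5}, {3, 4, 6}, {5, 6, 7, 8}]
```

Each call to `coverage` here recomputes the union from scratch, which is fine for a toy example; the abstract's point is that a purpose-built data structure can answer such marginal-gain queries in amortized logarithmic time.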