Optimization
A Two-Stage Bayesian Optimisation for Automatic Tuning of an Unscented Kalman Filter for Vehicle Sideslip Angle Estimation
Bertipaglia, A., Shyrokau, B., Alirezaei, M., Happee, R.
This paper presents a novel methodology to auto-tune an Unscented Kalman Filter (UKF). It involves using a Two-Stage Bayesian Optimisation (TSBO), based on a t-Student Process to optimise the process noise parameters of a UKF for vehicle sideslip angle estimation. Our method minimises performance metrics, given by the average sum of the states' and measurement' estimation error for various vehicle manoeuvres covering a wide range of vehicle behaviour. The predefined cost function is minimised through a TSBO which aims to find a location in the feasible region that maximises the probability of improving the current best solution. Results on an experimental dataset show the capability to tune the UKF in 79.9% less time than using a genetic algorithm (GA) and the overall capacity to improve the estimation performance in an experimental test dataset of 9.9% to the current state-of-the-art GA.
Off-the-grid learning of sparse mixtures from a continuous dictionary
Butucea, Cristina, Delmas, Jean-Franรงois, Dutfoy, Anne, Hardy, Clรฉment
We consider a general non-linear model where the signal is a finite mixture of an unknown, possibly increasing, number of features issued from a continuous dictionary parameterized by a real nonlinear parameter. The signal is observed with Gaussian (possibly correlated) noise in either a continuous or a discrete setup. We propose an off-the-grid optimization method, that is, a method which does not use any discretization scheme on the parameter space, to estimate both the non-linear parameters of the features and the linear parameters of the mixture. We use recent results on the geometry of off-the-grid methods to give minimal separation on the true underlying non-linear parameters such that interpolating certificate functions can be constructed. Using also tail bounds for suprema of Gaussian processes we bound the prediction error with high probability. Assuming that the certificate functions can be constructed, our prediction error bound is up to log --factors similar to the rates attained by the Lasso predictor in the linear regression model. We also establish convergence rates that quantify with high probability the quality of estimation for both the linear and the non-linear parameters.
High Dimensional Bayesian Optimization with Kernel Principal Component Analysis
Antonov, Kirill, Raponi, Elena, Wang, Hao, Doerr, Carola
Bayesian Optimization (BO) is a surrogate-based global optimization strategy that relies on a Gaussian Process regression (GPR) model to approximate the objective function and an acquisition function to suggest candidate points. It is well-known that BO does not scale well for high-dimensional problems because the GPR model requires substantially more data points to achieve sufficient accuracy and acquisition optimization becomes computationally expensive in high dimensions. Several recent works aim at addressing these issues, e.g., methods that implement online variable selection or conduct the search on a lower-dimensional sub-manifold of the original search space. Advancing our previous work of PCA-BO that learns a linear sub-manifold, this paper proposes a novel kernel PCA-assisted BO (KPCA-BO) algorithm, which embeds a non-linear sub-manifold in the search space and performs BO on this sub-manifold. Intuitively, constructing the GPR model on a lower-dimensional sub-manifold helps improve the modeling accuracy without requiring much more data from the objective function. Also, our approach defines the acquisition function on the lower-dimensional sub-manifold, making the acquisition optimization more manageable. We compare the performance of KPCA-BO to a vanilla BO and to PCA-BO on the multi-modal problems of the COCO/BBOB benchmark suite. Empirical results show that KPCA-BO outperforms BO in terms of convergence speed on most test problems, and this benefit becomes more significant when the dimensionality increases. For the 60D functions, KPCA-BO achieves better results than PCA-BO for many test cases. Compared to the vanilla BO, it efficiently reduces the CPU time required to train the GPR model and to optimize the acquisition function compared to the vanilla BO.
Primal-dual extrapolation methods for monotone inclusions under local Lipschitz continuity with applications to variational inequality, conic constrained saddle point, and convex conic optimization problems
In this paper we consider a class of structured monotone inclusion (MI) problems that consist of finding a zero in the sum of two monotone operators, in which one is maximal monotone while another is locally Lipschitz continuous. In particular, we first propose a primal-dual extrapolation (PDE) method for solving a structured strongly MI problem by modifying the classical forward-backward splitting method by using a point and operator extrapolation technique, in which the parameters are adaptively updated by a backtracking line search scheme. The proposed PDE method is almost parameter-free, equipped with a verifiable termination criterion, and enjoys an operation complexity of ${\cal O}(\log \epsilon^{-1})$, measured by the amount of fundamental operations consisting only of evaluations of one operator and resolvent of another operator, for finding an $\epsilon$-residual solution of the structured strongly MI problem. We then propose another PDE method for solving a structured non-strongly MI problem by applying the above PDE method to approximately solve a sequence of structured strongly MI problems. The resulting PDE method is parameter-free, equipped with a verifiable termination criterion, and enjoys an operation complexity of ${\cal O}(\epsilon^{-1}\log \epsilon^{-1})$ for finding an $\epsilon$-residual solution of the structured non-strongly MI problem. As a consequence, we apply the latter PDE method to convex conic optimization, conic constrained saddle point, and variational inequality problems, and obtain complexity results for finding an $\epsilon$-KKT or $\epsilon$-residual solution of them under local Lipschitz continuity. To the best of our knowledge, no prior studies were conducted to investigate methods with complexity guarantees for solving the aforementioned problems under local Lipschitz continuity. All the complexity results obtained in this paper are entirely new.
Approximating 1-Wasserstein Distance with Trees
Yamada, Makoto, Takezawa, Yuki, Sato, Ryoma, Bao, Han, Kozareva, Zornitsa, Ravi, Sujith
Wasserstein distance, which measures the discrepancy between distributions, shows efficacy in various types of natural language processing (NLP) and computer vision (CV) applications. One of the challenges in estimating Wasserstein distance is that it is computationally expensive and does not scale well for many distribution comparison tasks. In this paper, we aim to approximate the 1-Wasserstein distance by the tree-Wasserstein distance (TWD), where TWD is a 1-Wasserstein distance with tree-based embedding and can be computed in linear time with respect to the number of nodes on a tree. More specifically, we propose a simple yet efficient L1-regularized approach to learning the weights of the edges in a tree. To this end, we first show that the 1-Wasserstein approximation problem can be formulated as a distance approximation problem using the shortest path distance on a tree. We then show that the shortest path distance can be represented by a linear model and can be formulated as a Lasso-based regression problem. Owing to the convex formulation, we can obtain a globally optimal solution efficiently. Moreover, we propose a tree-sliced variant of these methods. Through experiments, we demonstrated that the weighted TWD can accurately approximate the original 1-Wasserstein distance.
Model-Based Disturbance Estimation for a Fiber-Reinforced Soft Manipulator using Orientation Sensing
Cangan, Barnabas Gavin, Navarro, Stefan Escaida, Yang, Bai, Zhang, Yu, Duriez, Christian, Katzschmann, Robert K.
For soft robots to work effectively in human-centered environments, they need to be able to estimate their state and external interactions based on (proprioceptive) sensors. Estimating disturbances allows a soft robot to perform desirable force control. Even in the case of rigid manipulators, force estimation at the end-effector is seen as a non-trivial problem. And indeed, other current approaches to address this challenge have shortcomings that prevent their general application. They are often based on simplified soft dynamic models, such as the ones relying on a piece-wise constant curvature (PCC) approximation or matched rigid-body models that do not represent enough details of the problem. Thus, the applications needed for complex human-robot interaction can not be built. Finite element methods (FEM) allow for predictions of soft robot dynamics in a more generic fashion. Here, using the soft robot modeling capabilities of the framework SOFA, we build a detailed FEM model of a multi-segment soft continuum robotic arm composed of compliant deformable materials and fiber-reinforced pressurized actuation chambers with a model for sensors that provide orientation output. This model is used to establish a state observer for the manipulator. Model parameters were calibrated to match imperfections of the manual fabrication process using physical experiments. We then solve a quadratic programming inverse dynamics problem to compute the components of external force that explain the pose error. Our experiments show an average force estimation error of around 1.2%. As the methods proposed are generic, these results are encouraging for the task of building soft robots exhibiting complex, reactive, sensor-based behavior that can be deployed in human-centered environments.
Engineers devise a recipe for improving any autonomous robotic system
Each of these robotic systems is a product of an ad hoc design process specific to that particular system. In designing an autonomous robot, engineers must run countless trial-and-error simulations, often informed by intuition. These simulations are tailored to a particular robot's components and tasks, in order to tune and optimize its performance. In some respects, designing an autonomous robot today is like baking a cake from scratch, with no recipe or prepared mix to ensure a successful outcome. Now, MIT engineers have developed a general design tool for roboticists to use as a sort of automated recipe for success.
Decentralized Gossip-Based Stochastic Bilevel Optimization over Communication Networks
Yang, Shuoguang, Zhang, Xuezhou, Wang, Mengdi
Bilevel optimization have gained growing interests, with numerous applications found in meta learning, minimax games, reinforcement learning, and nested composition optimization. This paper studies the problem of distributed bilevel optimization over a network where agents can only communicate with neighbors, including examples from multi-task, multi-agent learning and federated learning. In this paper, we propose a gossip-based distributed bilevel learning algorithm that allows networked agents to solve both the inner and outer optimization problems in a single timescale and share information via network propagation. We show that our algorithm enjoys the $\mathcal{O}(\frac{1}{K \epsilon^2})$ per-agent sample complexity for general nonconvex bilevel optimization and $\mathcal{O}(\frac{1}{K \epsilon})$ for strongly convex objective, achieving a speedup that scales linearly with the network size. The sample complexities are optimal in both $\epsilon$ and $K$. We test our algorithm on the examples of hyperparameter tuning and decentralized reinforcement learning. Simulated experiments confirmed that our algorithm achieves the state-of-the-art training efficiency and test accuracy.
Dynamic Reserve Price Design for Lazada Sponsored Search
In ecommerce platform, users will be less likely to use organic search if sponsored search shows them unexpected advertising items, which will be a hidden cost for the platform. In order to incorporate the hidden cost into auction mechanism which helps create positive growth for the platform, we turn to a reserve price design to decide whether we sell the traffic, as well as build healthy relationships between revenue and user experience. We propose a dynamic reserve price design framework to sell traffic more efficiently with minimal cost of user experience while keeping long term incentives to the advertisers to reveal their valuations truthfully. A distributed algorithm is also proposed to compute the reserve price with billion scale data in the production environment. Experiments with offline evaluations and online AB testing demonstrate that it is a simple and efficient method to be suitably used in industrial production. It has already been fully deployed in the production of Lazada sponsored search.
Efficient Inference of Spatially-varying Gaussian Markov Random Fields with Applications in Gene Regulatory Networks
Ravikumar, Visweswaran, Xu, Tong, Al-Holou, Wajd N., Fattahi, Salar, Rao, Arvind
In this paper, we study the problem of inferring spatially-varying Gaussian Markov random fields (SV-GMRF) where the goal is to learn a network of sparse, context-specific GMRFs representing network relationships between genes. An important application of SV-GMRFs is in inference of gene regulatory networks from spatially-resolved transcriptomics datasets. The current work on inference of SV-GMRFs are based on the regularized maximum likelihood estimation (MLE) and suffer from overwhelmingly high computational cost due to their highly nonlinear nature. To alleviate this challenge, we propose a simple and efficient optimization problem in lieu of MLE that comes equipped with strong statistical and computational guarantees. Our proposed optimization problem is extremely efficient in practice: we can solve instances of SV-GMRFs with more than 2 million variables in less than 2 minutes. We apply the developed framework to study how gene regulatory networks in Glioblastoma are spatially rewired within tissue, and identify prominent activity of the transcription factor HES4 and ribosomal proteins as characterizing the gene expression network in the tumor peri-vascular niche that is known to harbor treatment resistant stem cells.