Optimization
New Computational and Statistical Aspects of Regularized Regression with Application to Rare Feature Selection and Aggregation
Jalali, Amin, Javanmard, Adel, Fazel, Maryam
A large portion of estimation procedures in high-dimensional statistics and machine learning have been designed based on principles and methods in continuous optimization. In this pursuit, incorporating prior knowledge on the target model, often presented as discrete and combinatorial descriptions, has been of interest in the past decade. Aside from many individual cases that have been studied in the literature, a number of general frameworks have been proposed. For example, [Bach et al., 2013, Obozinski and Bach, 2016] define sparsity-related norms and their associated optimization tools from support-based monotone set functions. On the other hand, several unifications have been proposed for the purpose of providing estimation and recovery guarantees. A et al., 2012] which connects the success ofwell-known example is the work of [Chandrasekaran norm minimization in model recovery given random linear measurements to the notion of Gaussian width [Gordon, 1988]. However, many of the final results of these frameworks (excluding discrete et al., 2013]) are quantities that are hard to compute; even evaluating theapproaches such as [Bach norm. Therefore, many a time computational aspects of these norms and their associated quantities are treated on a case by case basis. In fact, a unified framework for turning discrete descriptions into continuous tools for estimation, that 1) provides a computational suite of optimization tools, and 2) is amenable to statistical analysis, is largely underdeveloped.
The Convergence of Iterative Delegations in Liquid Democracy in a Social Network
Escoffier, Bruno, Gilbert, Hugo, Pass-Lanneau, Adรจle
Liquid democracy is a collective decision making paradigm which lies between direct and representative democracy. One of its main features is that voters can delegate their votes in a transitive manner such that: A delegates to B and B delegates to C leads to A indirectly delegates to C. These delegations can be effectively empowered by implementing liquid democracy in a social network, so that voters can delegate their votes to any of their neighbors in the network. However, it is uncertain that such a delegation process will lead to a stable state where all voters are satisfied with the people representing them. We study the stability (w.r.t. voters preferences) of the delegation process in liquid democracy and model it as a game in which the players are the voters and the strategies are their possible delegations. We answer several questions on the equilibria of this process in any social network or in social networks that correspond to restricted types of graphs. We show that a Nash-equilibrium may not exist, and that it is even NP-complete to decide whether one exists or not. This holds even if the social network is a complete graph or a bounded degree graph. We further show that this existence problem is W[1]-hard w.r.t. the treewidth of the social network. Besides these hardness results, we demonstrate that an equilibrium always exists whatever the preferences of the voters iff the social network is a tree. We design a dynamic programming procedure to determine some desirable equilibria (e.g., minimizing the dissatisfaction of the voters) in polynomial time for tree social networks. Lastly, we study the convergence of delegation dynamics. Unfortunately, when an equilibrium exists, we show that a best response dynamics may not converge, even if the social network is a path or a complete graph.
ReinBo: Machine Learning pipeline search and configuration with Bayesian Optimization embedded Reinforcement Learning
Sun, Xudong, Lin, Jiali, Bischl, Bernd
Machine learning pipeline potentially consists of several stages of operations like data preprocessing, feature engineering and machine learning model training. Each operation has a set of hyper-parameters, which can become irrelevant for the pipeline when the operation is not selected. This gives rise to a hierarchical conditional hyper-parameter space. To optimize this mixed continuous and discrete conditional hierarchical hyper-parameter space, we propose an efficient pipeline search and configuration algorithm which combines the power of Reinforcement Learning and Bayesian Optimization. Empirical results show that our method performs favorably compared to state of the art methods like Auto-sklearn , TPOT, Tree Parzen Window, and Random Search.
Differential Dynamic Programming for Multi-Phase Rigid Contact Dynamics
Budhiraja, Rohan, Carpentier, Justin, Mastalli, Carlos, Mansard, Nicolas
A common strategy today to generate efficient locomotion movements is to split the problem into two consecutive steps: the first one generates the contact sequence together with the centroidal trajectory, while the second one computes the whole-body trajectory that follows the centroidal pattern. Yet the second step is generally handled by a simple program such as an inverse kinematics solver. In contrast, we propose to compute the whole-body trajectory by using a local optimal control solver, namely Differential Dynamic Programming (DDP). Our method produces more efficient motions, with lower forces and smaller impacts, by exploiting the Angular Momentum (AM). With this aim, we propose an original DDP formulation exploiting the Karush-Kuhn-Tucker constraint of the rigid contact model. We experimentally show the importance of this approach by executing large steps walking on the real HRP-2 robot, and by solving the problem of attitude control under the absence of external forces.
fmfn/BayesianOptimization
This is a constrained global optimization package built upon bayesian inference and gaussian process, that attempts to find the maximum value of an unknown function in as few iterations as possible. This technique is particularly suited for optimization of high cost functions, situations where the balance between exploration and exploitation is important. With the release of version 1.0.0 a number of API breaking changes were introduced. I understand this can be a headache for some, but these were necessary changes that needed to be done and ultimately made the package better. If you have used this package in the past I suggest you take the basic and advanced tours (found in the examples folder) in order to familiarize yourself with the new API.
Artificial Intelligence I: Basics and Games in Java
This course is about the fundamental concepts of artificial intelligence. This topic is getting very hot nowadays because these learning algorithms can be used in several fields from software engineering to investment banking. Learning algorithms can recognize patterns which can help detecting cancer for example. We may construct algorithms that can have a very good guess about stock price movement in the market. In the first chapter we are going to talk about the basic graph algorithms.
Simultaneous Contact, Gait and Motion Planning for Robust Multi-Legged Locomotion via Mixed-Integer Convex Optimization
Aceituno-Cabezas, Bernardo, Mastalli, Carlos, Dai, Hongkai, Focchi, Michele, Radulescu, Andreea, Caldwell, Darwin G., Cappelletto, Jose, Grieco, Juan C., Fernandez-Lopez, Gerardo, Semini, Claudio
Traditional motion planning approaches for multi-legged locomotion divide the problem into several stages, such as contact search and trajectory generation. However, reasoning about contacts and motions simultaneously is crucial for the generation of complex whole-body behaviors. Currently, coupling theses problems has required either the assumption of a fixed gait sequence and flat terrain condition, or non-convex optimization with intractable computation time. In this paper, we propose a mixed-integer convex formulation to plan simultaneously contact locations, gait transitions and motion, in a computationally efficient fashion. In contrast to previous works, our approach is not limited to flat terrain nor to a pre-specified gait sequence. Instead, we incorporate the friction cone stability margin, approximate the robot's torque limits, and plan the gait using mixed-integer convex constraints. We experimentally validated our approach on the HyQ robot by traversing different challenging terrains, where non-convexity and flat terrain assumptions might lead to sub-optimal or unstable plans. Our method increases the motion generality while keeping a low computation time.
Meta-Learning Acquisition Functions for Bayesian Optimization
Volpp, Michael, Frรถhlich, Lukas, Doerr, Andreas, Hutter, Frank, Daniel, Christian
Many practical applications of machine learning require data-efficient black-box function optimization, e.g., to identify hyperparameters or process settings. However, readily available algorithms are typically designed to be universal optimizers and are, thus, often suboptimal for specific tasks. We therefore propose a method to learn optimizers which are automatically adapted to a given class of objective functions, e.g., in the context of sim-to-real applications. Instead of learning optimization from scratch, the proposed approach is firmly based within the famous Bayesian optimization framework. Only the acquisition function (AF) is replaced by a learned neural network and therefore the resulting algorithm is still able to exploit the proven generalization capabilities of Gaussian processes. We present experiments on several simulated as well as on a sim-to-real transfer task. The results show that the learned optimizers (1) consistently perform better than or on-par with known AFs on general function classes and (2) can automatically identify structural properties of a function class using cheap simulations and transfer this knowledge to adapt rapidly to real hardware tasks, thereby significantly outperforming existing problem-agnostic AFs.
Generalized active learning and design of statistical experiments for manifold-valued data
In computer graphics and computer vision, usually either physically inspired analytic reflectance models, like Cook and Torrance (1981) or He et al. (1991), or parametric reflectance models chosen via qualitative criteria, like Phong (1975), or Lafortune et al. (1997), are used to model BRDFs. These BRDF models are only crude approximations of the reflectance of real materials. In multidimensional reflectometry, an alternative approach is usually taken. One directly measures values of the BRDF for different combinations of the incoming and outgoing angles and then fits the measured data to a selected analytic model using optimization techniques. There were numerous efforts to use modern machine learning techniques to construct datadriven BRDF models. Brady et al. (2014) proposed a method to generate new analytical BRDFs using a heuristic distance-based search procedure called Genetic Programming. In Brochu et al. (2008), an active learning algorithm using discrete perceptional data was developed and applied to learning parameters of BRDF models such as the Ashikhmin - Shirley model Ashikhmin and Shirley (2000), while Langovoy et al. (2016) treated active learning for the Cook - Torrance model Cook and Torrance (1981). Analysis of BRDF data with statistical and machine learning methods was discussed in Langovoy (2015b), Langovoy (2015a), Sole et al. (2018), Doctor and Byers (2018).
Quantifying Interpretability of Arbitrary Machine Learning Models Through Functional Decomposition
Molnar, Christoph, Casalicchio, Giuseppe, Bischl, Bernd
To obtain interpretable machine learning models, either interpretable models are constructed from the outset - e.g. shallow decision trees, rule lists, or sparse generalized linear models - or post-hoc interpretation methods - e.g. partial dependence or ALE plots - are employed. Both approaches have disadvantages. While the former can restrict the hypothesis space too conservatively, leading to potentially suboptimal solutions, the latter can produce too verbose or misleading results if the resulting model is too complex, especially w.r.t. feature interactions. We propose to make the compromise between predictive power and interpretability explicit by quantifying the complexity / interpretability of machine learning models. Based on functional decomposition, we propose measures of number of features used, interaction strength and main effect complexity. We show that post-hoc interpretation of models that minimize the three measures becomes more reliable and compact. Furthermore, we demonstrate the application of such measures in a multi-objective optimization approach which considers predictive power and interpretability at the same time.