Goto

Collaborating Authors

 Optimization


Baidu Apollo EM Motion Planner

arXiv.org Artificial Intelligence

In this manuscript, we introduce a real-time motion planning system based on the Baidu Apollo (open source) autonomous driving platform. The developed system aims to address the industrial level-4 motion planning problem while considering safety, comfort and scalability. The system covers multilane and single-lane autonomous driving in a hierarchical manner: (1) The top layer of the system is a multilane strategy that handles lane-change scenarios by comparing lane-level trajectories computed in parallel. (2) Inside the lane-level trajectory generator, it iteratively solves path and speed optimization based on a Frenet frame. (3) For path and speed optimization, a combination of dynamic programming and spline-based quadratic programming is proposed to construct a scalable and easy-to-tune framework to handle traffic rules, obstacle decisions and smoothness simultaneously. The planner is scalable to both highway and lower-speed city driving scenarios. We also demonstrate the algorithm through scenario illustrations and on-road test results. The system described in this manuscript has been deployed to dozens of Baidu Apollo autonomous driving vehicles since Apollo v1.5 was announced in September 2017. As of May 16th, 2018, the system has been tested under 3,380 hours and approximately 68,000 kilometers (42,253 miles) of closed-loop autonomous driving under various urban scenarios. The algorithm described in this manuscript is available at https://github.com/ApolloAuto/apollo/tree/master/modules/planning.


Boosting Combinatorial Problem Modeling with Machine Learning

arXiv.org Artificial Intelligence

In the past few years, the area of Machine Learning (ML) has witnessed tremendous advancements, becoming a pervasive technology in a wide range of applications. One area that can significantly benefit from the use of ML is Combinatorial Optimization. The three pillars of constraint satisfaction and optimization problem solving, i.e., modeling, search, and optimization, can exploit ML techniques to boost their accuracy, efficiency and effectiveness. In this survey we focus on the modeling component, whose effectiveness is crucial for solving the problem. The modeling activity has been traditionally shaped by optimization and domain experts, interacting to provide realistic results. Machine Learning techniques can tremendously ease the process, and exploit the available data to either create models or refine expert-designed ones. In this survey we cover approaches that have been recently proposed to enhance the modeling process by learning either single constraints, objective functions, or the whole model. We highlight common themes to multiple approaches and draw connections with related fields of research.


A Constrained Randomized Shortest-Paths Framework for Optimal Exploration

arXiv.org Machine Learning

The present work extends the randomized shortest-paths framework (RSP), interpolating between shortest-path and random-walk routing in a network, in three directions. First, it shows how to deal with equality constraints on a subset of transition probabilities and develops a generic algorithm for solving this constrained RSP problem using Lagrangian duality. Second, it derives a surprisingly simple iterative procedure to compute the optimal, randomized, routing policy generalizing the previously developed "soft" Bellman-Ford algorithm. The resulting algorithm allows balancing exploitation and exploration in an optimal way by interpolating between a pure random behavior and the deterministic, optimal, policy (least-cost paths) while satisfying the constraints. Finally, the two algorithms are applied to Markov decision problems by considering the process as a constrained RSP on a bipartite state-action graph. In this context, the derived "soft" value iteration algorithm appears to be closely related to dynamic policy programming as well as Kullback-Leibler and path integral control, and similar to a recently introduced reinforcement learning exploration strategy. This shows that this strategy is optimal in the RSP sense - it minimizes expected path cost subject to relative entropy constraint. Simulation results on illustrative examples show that the model behaves as expected.


Unseeded low-rank graph matching by transform-based unsupervised point registration

arXiv.org Machine Learning

The problem of learning a correspondence relationship between nodes of two networks has drawn much attention of the computer science community and recently that of statisticians. The unseeded version of this problem, in which we do not know any part of the true correspondence, is a long-standing challenge. For low-rank networks, the problem can be translated into an unsupervised point registration problem, in which two point sets generated from the same distribution are matchable by an unknown orthonormal transformation. Conventional methods generally lack consistency guarantee and are usually computationally costly. In this paper, we propose a novel approach to this problem. Instead of simultaneously estimating the unknown correspondence and orthonormal transformation to match up the two point sets, we match their distributions via minimizing our designed loss function capturing the discrepancy between their Laplace transforms, thus avoiding the optimization over all possible correspondences. This dramatically reduces the dimension of the optimization problem from $\Omega(n^2)$ parameters to $O(d^2)$ parameters, where $d$ is the fixed rank, and enables convenient theoretical analysis. In this paper, we provide arguably the first consistency guarantee and explicit error rate for general low-rank models. Our method provides control over the computational complexity ranging from $\omega(n)$ (any growth rate faster than $n$) to $O(n^2)$ while pertaining consistency. We demonstrate the effectiveness of our method through several numerical examples.


Query-Efficient Hard-label Black-box Attack:An Optimization-based Approach

arXiv.org Artificial Intelligence

We study the problem of attacking a machine learning model in the hard-label black-box setting, where no model information is revealed except that the attacker can make queries to probe the corresponding hard-label decisions. This is a very challenging problem since the direct extension of state-of-the-art white-box attacks (e.g., CW or PGD) to the hard-label black-box setting will require minimizing a non-continuous step function, which is combinatorial and cannot be solved by a gradient-based optimizer. The only current approach is based on random walk on the boundary, which requires lots of queries and lacks convergence guarantees. We propose a novel way to formulate the hard-label black-box attack as a real-valued optimization problem which is usually continuous and can be solved by any zeroth order optimization algorithm. For example, using the Randomized Gradient-Free method, we are able to bound the number of iterations needed for our algorithm to achieve stationary points. We demonstrate that our proposed method outperforms the previous random walk approach to attacking convolutional neural networks on MNIST, CIFAR, and ImageNet datasets. More interestingly, we show that the proposed algorithm can also be used to attack other discrete and non-continuous machine learning models, such as Gradient Boosting Decision Trees (GBDT).


Convergence Rate of Block-Coordinate Maximization Burer-Monteiro Method for Solving Large SDPs

arXiv.org Machine Learning

Semidefinite programming (SDP) with equality constraints arise in many optimization and machine learning problems, such as Max-Cut, community detection and robust PCA. Although SDPs can be solved to arbitrary precision in polynomial time, generic convex solvers do not scale well with the dimension of the problem. In order to address this issue, Burer and Monteiro \cite{burer2003nonlinear} proposed to reduce the dimension of the problem by appealing to a low-rank factorization, and solve the subsequent non-convex problem instead. It is well-understood that the resulting non-convex problem acts as a reliable surrogate to the original SDP, and can be efficiently solved using the block-coordinate maximization method. Despite its simplicity, remarkable success, and wide use in practice, the theoretical understanding of the convergence of this method is limited. We prove that the block-coordinate maximization algorithm applied to the non-convex Burer-Monteiro approach enjoys a global sublinear rate without any assumptions on the problem, and a local linear convergence rate despite no local maxima is locally strongly concave. We illustrate our results through examples and numerical experiments.


Negative Momentum for Improved Game Dynamics

arXiv.org Machine Learning

Games generalize the optimization paradigm by introducing different objective functions for different optimizing agents, known as players. Generative Adversarial Networks (GANs) are arguably the most popular game formulation in recent machine learning literature. GANs achieve great results on generating realistic natural images, however they are known for being difficult to train. Training them involves finding a Nash equilibrium, typically performed using gradient descent on the two players' objectives. Game dynamics can induce rotations that slow down convergence to a Nash equilibrium, or prevent it altogether. We provide a theoretical analysis of the game dynamics. Our analysis, supported by experiments, shows that gradient descent with a negative momentum term can improve the convergence properties of some GANs.


Statistical Inference with Local Optima

arXiv.org Machine Learning

We study the statistical properties of an estimator derived by applying a gradient ascent method with multiple initializations to a multi-modal likelihood function. We derive the population quantity that is the target of this estimator and study the properties of confidence intervals (CIs) constructed from asymptotic normality and the bootstrap approach. In particular, we analyze the coverage deficiency due to finite number of random initializations. We also investigate the CIs by inverting the likelihood ratio test, the score test, and the Wald test, and we show that the resulting CIs may be very different. We provide a summary of the uncertainties that we need to consider while making inference about the population. Note that we do not provide a solution to the problem of multiple local maxima; instead, our goal is to investigate the effect from local maxima on the behavior of our estimator. In addition, we analyze the performance of the EM algorithm under random initializations and derive the coverage of a CI with a finite number of initializations. Finally, we extend our analysis to a nonparametric mode hunting problem.


A survey on policy search algorithms for learning robot controllers in a handful of trials

arXiv.org Machine Learning

Most policy search algorithms require thousands of training episodes to find an effective policy, which is often infeasible with a physical robot. This survey article focuses on the extreme other end of the spectrum: how can a robot adapt with only a handful of trials (a dozen) and a few minutes? By analogy with the word "big-data", we refer to this challenge as "micro-data reinforcement learning". We show that a first strategy is to leverage prior knowledge on the policy structure (e.g., dynamic movement primitives), on the policy parameters (e.g., demonstrations), or on the dynamics (e.g., simulators). A second strategy is to create data-driven surrogate models of the expected reward (e.g., Bayesian optimization) or the dynamical model (e.g., model-based policy search), so that the policy optimizer queries the model instead of the real system. Overall, all successful micro-data algorithms combine these two strategies by varying the kind of model and prior knowledge. The current scientific challenges essentially revolve around scaling up to complex robots (e.g., humanoids), designing generic priors, and optimizing the computing time.


Last-Iterate Convergence: Zero-Sum Games and Constrained Min-Max Optimization

arXiv.org Machine Learning

Motivated by applications in Game Theory, Optimization, and Generative Adversarial Networks, recent work of Daskalakis et al. and Liang and Stokes has established that a variant of the widely used Gradient Descent/Ascent procedure, called "Optimistic Gradient Descent/Ascent (OGDA)", exhibits last-iterate convergence to saddle points in {\em unconstrained} convex-concave min-max optimization problems. We show that the same holds true in the more general problem of {\em constrained} min-max optimization under a variant of the Multiplicative-Weights-Update method called "Optimistic Multiplicative-Weights Update (OMWU)". The generality of the constrained problem, which in particular captures all Linear Programming, requires fundamentally different techniques for analyzing the progress of OMWU towards min-max solutions. We show that OMWU monotonically improves the Kullback-Leibler divergence of the current iterate to the (appropriately normalized) min-max solution until it enters a neighborhood of the solution. Inside that neighborhood we show that OMWU becomes a contracting map converging to the exact solution. We experiment with zero-sum games to measure how the convergence rate scales with the dimension.