 Optimization


Bandit-Based Planning and Learning in Continuous-Action Markov Decision Processes

AAAI Conferences

Recent research leverages results from the continuous-armed bandit literature to create a reinforcement-learning algorithm for continuous state and action spaces. The algorithm was initially proposed in a theoretical setting; we provide the first examination of its empirical properties. Through experimentation, we demonstrate the effectiveness of this planning method when coupled with exploration and model learning and show that, in addition to its formal guarantees, the approach is very competitive with other continuous-action reinforcement learners.


Route Planning for Bicycles - Exact Constrained Shortest Paths Made Practical via Contraction Hierarchy

AAAI Conferences

We consider the problem of computing shortest paths subject to an additional resource constraint such as a hard limit on the (positive) height difference of the path. This is typically of interest in the context of bicycle route planning, or when energy consumption is to be limited. So far, the exact computation of such constrained shortest paths was not feasible on large networks; we show that state-of-the-art speed-up techniques for the shortest path problem, like contraction hierarchies, can be instrumented to solve this problem efficiently in practice despite the NP-hardness in general.
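To make the problem statement concrete, here is a minimal sketch of a label-setting search for the resource-constrained shortest path problem. It is not the contraction-hierarchy method of the paper; it is the plain exact baseline that such speed-up techniques aim to accelerate. The graph encoding, the function name `constrained_shortest_path`, and the toy network are assumptions made for illustration.

```python
import heapq
from collections import defaultdict


def constrained_shortest_path(graph, source, target, budget):
    """Return (length, resource) of the shortest source-target path whose
    total resource use is at most `budget`, or None if none exists.

    graph: dict mapping node -> list of (neighbor, length, resource),
    with non-negative lengths and resources (hypothetical encoding).
    """
    # Pareto-optimal (length, resource) labels settled so far, per node.
    settled = defaultdict(list)
    queue = [(0, 0, source)]  # ordered by length, then by resource
    while queue:
        length, resource, node = heapq.heappop(queue)
        # Skip labels dominated by a label already settled at this node.
        if any(l <= length and r <= resource for l, r in settled[node]):
            continue
        settled[node].append((length, resource))
        if node == target:
            # Labels are popped in order of length, so this one is optimal.
            return length, resource
        for neighbor, edge_length, edge_resource in graph.get(node, ()):
            new_resource = resource + edge_resource
            if new_resource <= budget:  # prune paths over the hard limit
                heapq.heappush(
                    queue, (length + edge_length, new_resource, neighbor)
                )
    return None


# Toy network: the short route A-B-D climbs a lot, the long route A-C-D is flat.
toy_graph = {
    "A": [("B", 1, 5), ("C", 2, 1)],
    "B": [("D", 1, 5)],
    "C": [("D", 3, 1)],
}
```

With a generous budget the search returns the short, steep route; tightening the budget forces the longer, flatter one. The number of Pareto-optimal labels per node can grow exponentially, which is where the NP-hardness shows up in practice.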


ITOMP: Incremental Trajectory Optimization for Real-Time Replanning in Dynamic Environments

AAAI Conferences

We present a novel optimization-based algorithm for motion planning in dynamic environments. Our approach uses a stochastic trajectory optimization framework to avoid collisions and satisfy smoothness and dynamics constraints. Our algorithm does not require a priori knowledge about global motion or trajectories of dynamic obstacles. Rather, we compute a conservative local bound on the position or trajectory of each obstacle over a short time and use the bound to compute a collision-free trajectory for the robot in an incremental manner. Moreover, we interleave planning and execution of the robot in an adaptive manner to balance the planning horizon against responsiveness to obstacles. We highlight the performance of our planner in a simulated dynamic environment with the 7-DOF PR2 robot arm and dynamic obstacles.


Predicting Optimal Solution Cost with Bidirectional Stratified Sampling

AAAI Conferences

Optimal planning and heuristic search systems solve state-space search problems by finding a least-cost path from start to goal. As a byproduct of having an optimal path they also determine the optimal solution cost. In this paper we focus on the problem of determining the optimal solution cost for a state-space search problem directly, i.e., without actually finding a solution path of that cost. We present an efficient algorithm, BiSS, based on ideas of bidirectional search and stratified sampling that produces accurate estimates of the optimal solution cost. Our method is guaranteed to return the optimal solution cost in the limit as the sample size goes to infinity. We show empirically that our method makes accurate predictions in several domains. In addition, we show that our method scales to state spaces much larger than can be solved optimally. In particular, we estimate the average solution cost for the 6x6, 7x7, and 8x8 Sliding-Tile Puzzle and provide indirect evidence that these estimates are accurate.


Greedy expansions in convex optimization

arXiv.org Machine Learning

This paper is a follow-up to the author's previous paper on convex optimization. In that paper we began the process of adjusting greedy-type algorithms from nonlinear approximation for finding sparse solutions of convex optimization problems. There we modified the three greedy algorithms most popular in nonlinear approximation in Banach spaces -- the Weak Chebyshev Greedy Algorithm, the Weak Greedy Algorithm with Free Relaxation, and the Weak Relaxed Greedy Algorithm -- for solving convex optimization problems. We continue to study sparse approximate solutions to convex optimization problems. It is known that in many engineering applications researchers are interested in an approximate solution of an optimization problem as a linear combination of elements from a given system of elements. There is an increasing interest in building such sparse approximate solutions using different greedy-type algorithms. In this paper we concentrate on greedy algorithms that provide expansions, which means that the approximant at the $m$th iteration is equal to the sum of the approximant from the previous iteration (the $(m-1)$th iteration) and one element from the dictionary with an appropriate coefficient. The problem of greedy expansions of elements of a Banach space is well studied in nonlinear approximation theory. At first glance, the setting of the problem of expanding a given element and the setting of the expansion problem in optimization are very different. However, it turns out that the same technique can be used for solving both problems. We show how the technique developed in nonlinear approximation theory, in particular the greedy expansions technique, can be adjusted for finding a sparse solution of an optimization problem given by an expansion with respect to a given dictionary.
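The expansion structure described above (previous approximant plus one dictionary element with a coefficient) can be sketched in a few lines. This is a generic gradient-guided greedy expansion in finite dimensions, not any of the named algorithms from the paper. The selection rule, the ternary line search, the `radius` bound on coefficients, and the names `greedy_expansion` and `toy_energy` are all assumptions made for illustration.

```python
def greedy_expansion(energy, gradient, dictionary, dim, steps, radius=10.0):
    """Build x_m = x_{m-1} + c_m * g_m with g_m taken from `dictionary`.

    energy, gradient: callables for a convex function on lists of floats.
    Returns the final point and the list of (element, coefficient) pairs.
    """
    x = [0.0] * dim
    expansion = []
    for _ in range(steps):
        grad = gradient(x)
        # Pick the element most aligned (in absolute value) with the gradient.
        g = max(
            dictionary,
            key=lambda d: abs(sum(gi * di for gi, di in zip(grad, d))),
        )

        def along_line(c):
            return energy([xi + c * gi for xi, gi in zip(x, g)])

        # Ternary search for the coefficient; valid because the energy
        # restricted to a line is convex. `radius` is an assumed bound.
        lo, hi = -radius, radius
        for _ in range(100):
            m1 = lo + (hi - lo) / 3
            m2 = hi - (hi - lo) / 3
            if along_line(m1) <= along_line(m2):
                hi = m2
            else:
                lo = m1
        c = (lo + hi) / 2
        x = [xi + c * gi for xi, gi in zip(x, g)]
        expansion.append((g, c))
    return x, expansion


# Toy convex energy with minimizer (3, -1) and the standard basis as dictionary.
def toy_energy(x):
    return 0.5 * ((x[0] - 3.0) ** 2 + (x[1] + 1.0) ** 2)


def toy_gradient(x):
    return [x[0] - 3.0, x[1] + 1.0]


solution, terms = greedy_expansion(
    toy_energy, toy_gradient, [(1.0, 0.0), (0.0, 1.0)], dim=2, steps=2
)
```

After `m` steps the result is a combination of at most `m` dictionary elements, which is the sense in which the solution is sparse.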


Greedy approximation in convex optimization

arXiv.org Machine Learning

We study sparse approximate solutions to convex optimization problems. It is known that in many engineering applications researchers are interested in an approximate solution of an optimization problem as a linear combination of elements from a given system of elements. There is an increasing interest in building such sparse approximate solutions using different greedy-type algorithms. The problem of approximation of a given element of a Banach space by linear combinations of elements from a given system (dictionary) is well studied in nonlinear approximation theory. At first glance, the settings of approximation and optimization problems are very different. In the approximation problem an element is given and our task is to find a sparse approximation of it. In optimization theory an energy function is given and we should find an approximate sparse solution to the minimization problem. It turns out that the same technique can be used for solving both problems. We show how the technique developed in nonlinear approximation theory, in particular the greedy approximation technique, can be adjusted for finding a sparse solution of an optimization problem.


Sparse Trace Norm Regularization

arXiv.org Machine Learning

We study the problem of estimating multiple predictive functions from a dictionary of basis functions in the nonparametric regression setting. Our estimation scheme assumes that each predictive function can be estimated in the form of a linear combination of the basis functions. By assuming that the coefficient matrix admits a sparse low-rank structure, we formulate the function estimation problem as a convex program regularized by the trace norm and the $\ell_1$-norm simultaneously. We propose to solve the convex program using the accelerated gradient (AG) method and the alternating direction method of multipliers (ADMM) respectively; we also develop efficient algorithms to solve the key components in both AG and ADMM. In addition, we conduct theoretical analysis on the proposed function estimation scheme: we derive a key property of the optimal solution to the convex program; based on an assumption on the basis functions, we establish a performance bound of the proposed function estimation scheme (via the composite regularization). Simulation studies demonstrate the effectiveness and efficiency of the proposed algorithms.


Sparse Approximation via Penalty Decomposition Methods

arXiv.org Machine Learning

In this paper we consider sparse approximation problems, that is, general $l_0$ minimization problems with the $l_0$-"norm" of a vector being a part of the constraints or the objective function. In particular, we first study the first-order optimality conditions for these problems. We then propose penalty decomposition (PD) methods for solving them in which a sequence of penalty subproblems is solved by a block coordinate descent (BCD) method. Under some suitable assumptions, we establish that any accumulation point of the sequence generated by the PD methods satisfies the first-order optimality conditions of the problems. Furthermore, for the problems in which the $l_0$ part is the only nonconvex part, we show that such an accumulation point is a local minimizer of the problems. In addition, we show that any accumulation point of the sequence generated by the BCD method is a saddle point of the penalty subproblem. Moreover, for the problems in which the $l_0$ part is the only nonconvex part, we establish that such an accumulation point is a local minimizer of the penalty subproblem. Finally, we test the performance of our PD methods by applying them to sparse logistic regression, sparse inverse covariance selection, and compressed sensing problems. The computational results demonstrate that our methods generally outperform the existing methods in terms of solution quality and/or speed.
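As a small illustration of what an $l_0$ constraint means computationally, the sketch below projects a vector onto the set of vectors with at most `k` nonzero entries. This is a standard building block in $l_0$-constrained methods; the abstract does not state that the PD methods use exactly this step, so treat that link as an assumption. The function name `project_l0` is hypothetical.

```python
def project_l0(x, k):
    """Return the closest vector to x (in Euclidean distance) with at most
    k nonzero entries: keep the k largest-magnitude entries, zero the rest.
    """
    k = max(k, 0)
    # Indices sorted by decreasing magnitude; ties keep their original order.
    order = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    keep = set(order[:k])
    return [x[i] if i in keep else 0.0 for i in range(len(x))]


sparse = project_l0([0.5, -3.0, 2.0, 0.1], 2)
```

Although the constraint set is nonconvex, this projection has a closed form, which is one reason splitting the $l_0$ part from the rest of the problem can be attractive.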


A Mixed Integer Programming Model Formulation for Solving the Lot-Sizing Problem

arXiv.org Artificial Intelligence

This paper addresses a mixed integer programming (MIP) formulation for the multi-item uncapacitated lot-sizing problem, inspired by a trailer manufacturer. The proposed MIP model has been used to find the optimal order quantity, the optimal order time, and the minimum total cost of purchasing, ordering, and holding over the predefined planning horizon. This problem is known to be NP-hard. The model was implemented in optimization software using LINGO 13.0.
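For intuition about the trade-off between ordering and holding costs, here is a minimal sketch of the classic Wagner-Whitin dynamic program for the single-item uncapacitated case. It is not the paper's multi-item MIP model or its LINGO implementation. It assumes a fixed cost per order, a linear holding cost per unit per period, positive demand, and a constant unit purchase price (which therefore drops out of the optimization); the function name and data are hypothetical.

```python
def wagner_whitin(demand, setup_cost, holding_cost):
    """Return (minimum total cost, list of (order period, order quantity)).

    demand[t] is the demand in period t; an order placed in period s may
    cover demand of periods s, s+1, ..., with units held until they are used.
    """
    n = len(demand)
    best = [0.0] + [float("inf")] * n  # best[t]: min cost to cover periods < t
    last_order = [0] * (n + 1)         # period of the last order in best[t]
    for t in range(1, n + 1):
        for s in range(t):
            # One order in period s covers periods s..t-1.
            holding = sum(
                holding_cost * (j - s) * demand[j] for j in range(s, t)
            )
            cost = best[s] + setup_cost + holding
            if cost < best[t]:
                best[t], last_order[t] = cost, s
    # Walk back through the recorded decisions to recover the order plan.
    orders, t = [], n
    while t > 0:
        s = last_order[t]
        orders.append((s, sum(demand[s:t])))
        t = s
    orders.reverse()
    return best[n], orders


cheap_holding = wagner_whitin([20, 50, 10], setup_cost=100, holding_cost=1)
costly_holding = wagner_whitin([20, 50, 10], setup_cost=100, holding_cost=5)
```

With cheap holding, a single large order in the first period is best; when holding is expensive, the plan splits into two orders.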


Solving Limited Memory Influence Diagrams

Journal of Artificial Intelligence Research

We present a new algorithm for exactly solving decision making problems represented as influence diagrams. We do not require the usual assumptions of no forgetting and regularity; this allows us to solve problems with simultaneous decisions and limited information. The algorithm is empirically shown to outperform a state-of-the-art algorithm on randomly generated problems of up to 150 variables and 10^64 solutions. We show that these problems are NP-hard even if the underlying graph structure of the problem has low treewidth and the variables take on a bounded number of states, and that they admit no provably good approximation if variables can take on an arbitrary number of states.