Goto

Collaborating Authors

Mathematical & Statistical Methods


Linearly-solvable Markov decision problems

Neural Information Processing Systems

We introduce a class of MDPs which greatly simplify Reinforcement Learning. They have discrete state spaces and continuous control spaces. The controls have the effect of rescaling the transition probabilities of an underlying Markov chain. A control cost penalizing KL divergence between controlled and uncontrolled transition probabilities makes the minimization problem convex, and allows analytical computation of the optimal controls given the optimal value function. An exponential transformation of the optimal value function makes the minimized Bellman equation linear.
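The exponential transformation can be sketched numerically. Below is a minimal, hypothetical first-exit example (the chain, state costs, and boundary condition are illustrative, not from the paper): with z = exp(-v), the minimized Bellman equation becomes the linear relation z = exp(-q) * (P z), solvable by simple iteration, and the optimal controls are the passive transitions rescaled by z.

```python
import numpy as np

# Hypothetical 5-state chain; state 4 is an absorbing goal (all values illustrative).
n = 5
P = np.zeros((n, n))                 # uncontrolled ("passive") transition probabilities
for i in range(n - 1):
    P[i, max(i - 1, 0)] = 0.5        # drift left
    P[i, i + 1] = 0.5                # drift right
P[4, 4] = 1.0                        # terminal state
q = np.array([1.0, 1.0, 1.0, 1.0, 0.0])  # state costs; zero at the goal

# With z = exp(-v), the minimized Bellman equation is linear: z = exp(-q) * (P @ z).
z = np.ones(n)
for _ in range(1000):
    z_new = np.exp(-q) * (P @ z)
    z_new[4] = 1.0                   # boundary condition at the terminal state
    if np.max(np.abs(z_new - z)) < 1e-12:
        z = z_new
        break
    z = z_new

v = -np.log(z)                       # optimal value function
# Optimal controlled transitions rescale the passive dynamics by z, then renormalize:
u = P * z[None, :]
u /= u.sum(axis=1, keepdims=True)
print(np.round(v, 3))
```

The iteration converges because exp(-q) < 1 at every non-terminal state, making the map a contraction for this first-exit problem.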


Fast Computation of Graph Kernels

Neural Information Processing Systems

Using extensions of linear algebra concepts to Reproducing Kernel Hilbert Spaces (RKHS), we define a unifying framework for random walk kernels on graphs.
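For reference, here is a hedged sketch of what the standard geometric random walk kernel computes, using the naive direct-product formula; the toy graphs, decay weight, and uniform start/stop distributions are illustrative assumptions, and the paper's contribution is precisely to compute this quantity faster than this direct approach.

```python
import numpy as np

def rw_kernel(A1, A2, lam=0.1):
    """Geometric random walk kernel via the naive Kronecker-product formula.
    Counts walks common to both graphs, discounting a walk of length k by lam**k."""
    W = np.kron(A1, A2)                  # adjacency matrix of the direct product graph
    n = W.shape[0]
    p = np.ones(n) / n                   # uniform starting distribution
    q = np.ones(n)                       # uniform stopping weights
    # k(G1, G2) = q^T (I - lam * W)^{-1} p  (requires lam * spectral_radius(W) < 1)
    return q @ np.linalg.solve(np.eye(n) - lam * W, p)

# toy graphs: a triangle and a 3-node path (illustrative)
A_tri = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], float)
A_path = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], float)
print(rw_kernel(A_tri, A_tri), rw_kernel(A_tri, A_path))
```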


Supervised Machine Learning with a Novel Kernel Density Estimator

arXiv.org Machine Learning

In recent years, kernel density estimation has been exploited by computer scientists to model machine learning problems. Kernel density estimation based approaches are of interest due to the low time complexity of either O(n) or O(n*log(n)) for constructing a classifier, where n is the number of sampling instances. Concerning the design of kernel density estimators, one essential issue is how fast the pointwise mean square error (MSE) and/or the integrated mean square error (IMSE) diminish as the number of sampling instances increases. In this article, it is shown that with the proposed kernel function it is feasible to make the pointwise MSE of the density estimator converge at rate O(n^(-2/3)) regardless of the dimension of the vector space, provided that the probability density function at the point of interest meets certain conditions.
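As a point of reference for this family of methods, the following sketch shows a classifier built from a plain Gaussian kernel density estimate, with density evaluation linear in the number of samples. This is not the paper's proposed estimator; the bandwidth, data, and priors are all illustrative assumptions.

```python
import numpy as np

def kde_density(x, samples, h):
    """Standard Gaussian kernel density estimate at point x (O(n) per query)."""
    d = samples.shape[1]
    u = (x - samples) / h
    k = np.exp(-0.5 * np.sum(u * u, axis=1)) / ((2 * np.pi) ** (d / 2) * h ** d)
    return k.mean()

def kde_classify(x, class_samples, h=0.5):
    # posterior score = class prior (from sample counts) * class-conditional density
    n_total = sum(len(s) for s in class_samples.values())
    scores = {c: (len(s) / n_total) * kde_density(x, s, h)
              for c, s in class_samples.items()}
    return max(scores, key=scores.get)

# two illustrative Gaussian clusters, one per class
rng = np.random.default_rng(0)
data = {0: rng.normal(-2, 1, size=(100, 2)),
        1: rng.normal(+2, 1, size=(100, 2))}
print(kde_classify(np.array([-2.0, -2.0]), data))
```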


Mixed Integer Linear Programming For Exact Finite-Horizon Planning In Decentralized Pomdps

arXiv.org Artificial Intelligence

We consider the problem of finding an n-agent joint-policy for the optimal finite-horizon control of a decentralized Pomdp (Dec-Pomdp). This is a problem of very high complexity (NEXP-hard for n >= 2). In this paper, we propose a new mathematical programming approach for the problem. Our approach is based on two ideas: First, we represent each agent's policy in the sequence-form and not in the tree-form, thereby obtaining a very compact representation of the set of joint-policies. Second, using this compact representation, we solve this problem as an instance of combinatorial optimization for which we formulate a mixed integer linear program (MILP). The optimal solution of the MILP directly yields an optimal joint-policy for the Dec-Pomdp. Computational experience shows that formulating and solving the MILP requires significantly less time than existing algorithms on benchmark Dec-Pomdp problems. For example, the multi-agent tiger problem for horizon 4 is solved in 72 secs with the MILP whereas existing algorithms require several hours to solve it.
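The compactness of the sequence-form can be illustrated with back-of-the-envelope counting for the horizon-4 tiger problem mentioned above (3 actions and 2 observations per agent). The formulas below are one plausible way to count and are not taken from the paper.

```python
# Illustrative counting for the multi-agent tiger problem:
# |A| = 3 actions, |O| = 2 observations per agent, horizon T = 4.
A, O, T = 3, 2, 4

# Tree-form: a deterministic policy assigns an action to every node of the
# depth-T observation tree, which has sum_t O^(t-1) nodes.
nodes = sum(O ** (t - 1) for t in range(1, T + 1))               # 1 + 2 + 4 + 8 = 15
tree_policies = A ** nodes                                        # 3^15 policies per agent

# Sequence-form: one variable per action-observation sequence
# a1 o1 a2 o2 ... at, i.e. sum_t A^t * O^(t-1) sequences per agent.
sequences = sum(A ** t * O ** (t - 1) for t in range(1, T + 1))   # far fewer variables
print(tree_policies, sequences)
```

Enumerating roughly 14 million tree policies per agent is hopeless, whereas a few hundred sequence variables fit naturally into a MILP.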


Nonparametric Representation of Policies and Value Functions: A Trajectory-Based Approach

Neural Information Processing Systems

A longstanding goal of reinforcement learning is to develop nonparametric representations of policies and value functions that support rapid learning without suffering from interference or the curse of dimensionality. We have developed a trajectory-based approach, in which policies and value functions are represented nonparametrically along trajectories. These trajectories, policies, and value functions are updated as the value function becomes more accurate or as a model of the task is updated. We have applied this approach to periodic tasks such as hopping and walking, which required handling discount factors and discontinuities in the task dynamics, and using function approximation to represent value functions at discontinuities. We also describe extensions of the approach to make the policies more robust to modeling error and sensor noise.


On Iterative Krylov-Dogleg Trust-Region Steps for Solving Neural Networks Nonlinear Least Squares Problems

Neural Information Processing Systems

Our algorithm exploits the special structure of the sum-of-squared-error measure in Equation (1); hence, other objective functions are outside the scope of this paper. The gradient vector and Hessian matrix are given by g = g(θ) = J^T r and H = H(θ) = J^T J + S, where J is the m x n Jacobian matrix of r, and S denotes the matrix of second-derivative terms. If S is simply omitted based on the "small residual" assumption, then the Hessian matrix reduces to the Gauss-Newton model Hessian: i.e., J^T J. Furthermore, a family of quasi-Newton methods can be applied to approximate the term S alone, leading to the augmented Gauss-Newton model Hessian (see, for example, Mizutani [2] and references therein).
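A minimal Gauss-Newton iteration under the small-residual assumption (S omitted, model Hessian J^T J) might look like the following sketch; the exponential model, data, and starting point are illustrative, and no trust-region safeguard is included.

```python
import numpy as np

# Fit y ~ a * exp(b * t) by minimizing E(theta) = 0.5 * ||r(theta)||^2,
# where r = y - a * exp(b * t) and theta = (a, b).
def residuals(theta, t, y):
    return y - theta[0] * np.exp(theta[1] * t)

def jacobian(theta, t):
    J = np.empty((t.size, 2))
    J[:, 0] = -np.exp(theta[1] * t)                    # dr/da
    J[:, 1] = -theta[0] * t * np.exp(theta[1] * t)     # dr/db
    return J

t = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(-1.5 * t)          # noiseless synthetic data (true a=2, b=-1.5)
theta = np.array([1.0, -1.0])       # illustrative starting point
for _ in range(50):
    r = residuals(theta, t, y)
    J = jacobian(theta, t)
    g = J.T @ r                     # gradient g = J^T r
    # Gauss-Newton step: solve (J^T J) delta = -g, i.e. S is dropped from H
    delta = np.linalg.solve(J.T @ J, -g)
    theta = theta + delta
print(np.round(theta, 6))
```

Because the residual vanishes at the solution, the omitted term S is exactly zero there and Gauss-Newton converges rapidly on this example.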


Tree-Based Modeling and Estimation of Gaussian Processes on Graphs with Cycles

Neural Information Processing Systems

We present the embedded trees algorithm, an iterative technique for estimation of Gaussian processes defined on arbitrary graphs. By exactly solving a series of modified problems on embedded spanning trees, it computes the conditional means with an efficiency comparable to or better than other techniques. The error covariance computation is most efficient for graphs in which removing a small number of edges reveals an embedded tree. In this context, we demonstrate that sparse loopy graphs can provide a significant increase in modeling power relative to trees, with only a minor increase in estimation complexity.

1 Introduction

Graphical models are an invaluable tool for defining and manipulating probability distributions. In modeling stochastic processes with graphical models, two basic problems arise: (i) specifying a class of graphs with which to model or approximate the process; and (ii) determining efficient techniques for statistical inference. At one extreme are tree-structured graphs: although they lead to highly efficient algorithms for estimation [1, 2], their modeling power is often limited.
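The splitting idea behind such an embedded-trees-style iteration can be sketched on a single 4-cycle: cut one edge to expose a spanning tree, then repeatedly solve the tree-structured problem with the cut edge fed back from the previous iterate. The model parameters below are illustrative, and a dense solve stands in for the efficient tree solver.

```python
import numpy as np

# Gaussian model on a 4-cycle: J is the information (inverse covariance)
# matrix, and the conditional mean solves J mu = h. (Values are illustrative.)
J = np.array([[ 2.0, -0.5,  0.0, -0.5],
              [-0.5,  2.0, -0.5,  0.0],
              [ 0.0, -0.5,  2.0, -0.5],
              [-0.5,  0.0, -0.5,  2.0]])
h = np.array([1.0, 0.0, 0.0, 1.0])

# Cut edge (0, 3) to expose an embedded spanning tree (here a simple chain),
# giving the splitting J = J_tree - K.
J_tree = J.copy()
J_tree[0, 3] = J_tree[3, 0] = 0.0
K = J_tree - J

x = np.zeros(4)
for _ in range(200):
    # each sweep solves a tree-structured system J_tree x_{k+1} = h + K x_k;
    # a dense solve stands in for the O(n) tree solver, purely for illustration
    x = np.linalg.solve(J_tree, h + K @ x)
print(np.round(x, 6))
```

At a fixed point, J_tree x = h + (J_tree - J) x reduces to J x = h, so the iteration recovers the exact conditional mean of the loopy model.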


Neural Control for Nonlinear Dynamic Systems

Neural Information Processing Systems

A neural network based approach is presented for controlling two distinct types of nonlinear systems. The first corresponds to nonlinear systems with parametric uncertainties where the parameters occur nonlinearly. The second corresponds to systems for which stabilizing control structures cannot be determined. The proposed neural controllers are shown to result in closed-loop system stability under certain conditions.

