 Undirected Networks


Coastal Navigation with Mobile Robots

Neural Information Processing Systems

The problem that we address in this paper is how a mobile robot can plan in order to arrive at its goal with minimum uncertainty. Traditional motion planning algorithms often assume that a mobile robot can track its position reliably; however, in real-world situations, reliable localization may not always be feasible. Partially Observable Markov Decision Processes (POMDPs) provide one way to maximize the certainty of reaching the goal state, but at the cost of computational intractability for large state spaces. The method we propose explicitly models the uncertainty of the robot's position as a state variable, and generates trajectories through the augmented pose-uncertainty space. By minimizing the positional uncertainty at the goal, the robot reduces the likelihood that it becomes lost.


On Input Selection with Reversible Jump Markov Chain Monte Carlo Sampling

Neural Information Processing Systems

In this paper we will treat input selection for a radial basis function (RBF) like classifier within a Bayesian framework. We approximate the a-posteriori distribution over both model coefficients and input subsets by samples drawn with Gibbs updates and reversible jump moves. Using some public datasets, we compare the classification accuracy of the method with a conventional ARD scheme. These datasets are also used to infer the a-posteriori probabilities of different input subsets.


The Nonnegative Boltzmann Machine

Neural Information Processing Systems

The nonnegative Boltzmann machine (NNBM) is a recurrent neural network model that can describe multimodal nonnegative data. Application of maximum likelihood estimation to this model gives a learning rule that is analogous to the binary Boltzmann machine. We examine the utility of the mean field approximation for the NNBM, and describe how Monte Carlo sampling techniques can be used to learn its parameters. Reflective slice sampling is particularly well-suited for this distribution, and can efficiently be implemented to sample the distribution. We illustrate learning of the NNBM on a translationally invariant distribution, as well as on a generative model for images of human faces.


Constrained Hidden Markov Models

Neural Information Processing Systems

By thinking of each state in a hidden Markov model as corresponding to some spatial region of a fictitious topology space, it is possible to naturally define neighbouring states as those which are connected in that space. The transition matrix can then be constrained to allow transitions only between neighbours; this means that all valid state sequences correspond to connected paths in the topology space. I show how such constrained HMMs can learn to discover underlying structure in complex sequences of high dimensional data, and apply them to the problem of recovering mouth movements from acoustics in continuous speech. Probabilistic unsupervised learning for such sequences requires models with two essential features: latent (hidden) variables and topology in those variables. Hidden Markov models (HMMs) can be thought of as dynamic generalizations of discrete state static data models such as Gaussian mixtures, or as discrete state versions of linear dynamical systems (LDSs) (which are themselves dynamic generalizations of continuous latent variable models such as factor analysis).
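
The neighbour constraint on the transition matrix can be illustrated with a minimal sketch. It is not the paper's implementation: the 3 x 3 grid topology, the 4-connected neighbourhood, the choice to allow self-transitions, and the helper names `grid_neighbour_mask` and `constrain_transitions` are all assumptions made for illustration.

```python
import numpy as np

def grid_neighbour_mask(rows, cols):
    """Boolean mask: True where state j is state i itself or a
    4-connected neighbour of i on a rows x cols grid (assumed topology)."""
    n = rows * cols
    mask = np.zeros((n, n), dtype=bool)
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            mask[i, i] = True  # self-transitions allowed (assumption)
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    mask[i, rr * cols + cc] = True
    return mask

def constrain_transitions(raw, mask):
    """Zero out transitions between non-neighbours and renormalise rows."""
    constrained = np.where(mask, raw, 0.0)
    return constrained / constrained.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
mask = grid_neighbour_mask(3, 3)
T = constrain_transitions(rng.random((9, 9)), mask)
```

Because entries outside the mask are exactly zero, any state sequence with non-zero probability under `T` moves only between adjacent grid cells, which is the connected-path property described above.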


Spiking Boltzmann Machines

Neural Information Processing Systems

We first show how to represent sharp posterior probability distributions using real valued coefficients on broadly-tuned basis functions. Then we show how the precise times of spikes can be used to convey the real-valued coefficients on the basis functions quickly and accurately. Finally we describe a simple simulation in which spiking neurons learn to model an image sequence by fitting a dynamic generative model. A perceived object is represented in the brain by the activities of many neurons, but there is no general consensus on how the activities of individual neurons combine to represent the multiple properties of an object. We start by focussing on the case of a single object that has multiple instantiation parameters such as position, velocity, size and orientation. We assume that each neuron has an ideal stimulus in the space of instantiation parameters and that its activation rate or probability of activation falls off monotonically in all directions as the actual stimulus departs from this ideal.


Policy Search via Density Estimation

Neural Information Processing Systems

We propose a new approach to the problem of searching a space of stochastic controllers for a Markov decision process (MDP) or a partially observable Markov decision process (POMDP). Following several other authors, our approach is based on searching in parameterized families of policies (for example, via gradient descent) to optimize solution quality. However, rather than trying to estimate the values and derivatives of a policy directly, we do so indirectly using estimates for the probability densities that the policy induces on states at the different points in time. This enables our algorithms to exploit the many techniques for efficient and robust approximate density propagation in stochastic systems. We show how our techniques can be applied both to deterministic propagation schemes (where the MDP's dynamics are given explicitly in compact form) and to stochastic propagation schemes (where we have access only to a generative model, or simulator, of the MDP).


Bayesian Modelling of fMRI Time Series

Neural Information Processing Systems

We present a Hidden Markov Model (HMM) for inferring the hidden psychological state (or neural activity) during single trial fMRI activation experiments with blocked task paradigms. Inference is based on Bayesian methodology, using a combination of analytical and a variety of Markov Chain Monte Carlo (MCMC) sampling techniques. The advantage of this method is that detection of short time learning effects between repeated trials is possible since inference is based only on single trial experiments.


Approximate Planning in Large POMDPs via Reusable Trajectories

Neural Information Processing Systems

We consider the problem of reliably choosing a near-best strategy from a restricted class of strategies Π in a partially observable Markov decision process (POMDP). We assume we are given the ability to simulate the POMDP, and study what might be called the sample complexity - that is, the amount of data one must generate in the POMDP in order to choose a good strategy. We prove upper bounds on the sample complexity showing that, even for infinitely large and arbitrarily complex POMDPs, the amount of data needed can be finite, and depends only linearly on the complexity of the restricted strategy class Π, and exponentially on the horizon time. This latter dependence can be eased in a variety of ways, including the application of gradient and local search algorithms.


Hierarchical Image Probability (HIP) Models

Neural Information Processing Systems

We formulate a model for probability distributions on image spaces. We show that any distribution of images can be factored exactly into conditional distributions of feature vectors at one resolution (pyramid level) conditioned on the image information at lower resolutions. We would like to factor this over positions in the pyramid levels to make it tractable, but such factoring may miss long-range dependencies. To fix this, we introduce hidden class labels at each pixel in the pyramid. The result is a hierarchical mixture of conditional probabilities, similar to a hidden Markov model on a tree.


Monte Carlo POMDPs

Neural Information Processing Systems

We present a Monte Carlo algorithm for learning to act in partially observable Markov decision processes (POMDPs) with real-valued state and action spaces. Our approach uses importance sampling for representing beliefs, and Monte Carlo approximation for belief propagation. A reinforcement learning algorithm, value iteration, is employed to learn value functions over belief states. Finally, a sample-based version of nearest neighbor is used to generalize across states. Initial empirical results suggest that our approach works well in practical applications.
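
The idea of representing a belief by samples and updating it with importance weights can be sketched as a generic particle-filter step. This is a hedged illustration rather than the authors' algorithm: the one-dimensional state, the additive Gaussian motion model, the Gaussian observation likelihood, and the name `belief_update` with its parameters are all hypothetical.

```python
import numpy as np

def belief_update(particles, action, observation, rng,
                  motion_std=0.1, obs_std=0.5):
    """One sample-based belief update: propagate, weight, resample."""
    # Propagate each particle through an assumed additive-noise motion model.
    noise = rng.normal(0.0, motion_std, size=particles.shape)
    predicted = particles + action + noise
    # Importance weights from an assumed Gaussian observation likelihood,
    # computed in log space for numerical stability.
    log_w = -0.5 * ((observation - predicted) / obs_std) ** 2
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    # Resample in proportion to the weights to get an unweighted sample set.
    idx = rng.choice(len(predicted), size=len(predicted), p=w)
    return predicted[idx]

rng = np.random.default_rng(0)
belief = rng.uniform(-5.0, 5.0, size=2000)  # broad initial belief
for _ in range(5):
    belief = belief_update(belief, action=0.0, observation=1.0, rng=rng)
```

After a few updates with a repeated observation, the particle set concentrates around the observed value. A value function over such sample-based beliefs would still need a way to compare sample sets, which is where the nearest-neighbor step mentioned in the abstract comes in.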