
Computation of Smooth Optical Flow in a Feedback Connected Analog Network

Neural Information Processing Systems

In 1986, Tanner and Mead [1] implemented an interesting constraint satisfaction circuit for global motion sensing in aVLSI. We report here a new and improved aVLSI implementation that provides smooth optical flow as well as global motion in a two-dimensional visual field. The computation of optical flow is an ill-posed problem, which expresses itself as the aperture problem. However, the optical flow can be estimated by the use of regularization methods, in which additional constraints are introduced in terms of a global energy functional that must be minimized. We show how the algorithmic constraints of Horn and Schunck [2] on computing smooth optical flow can be mapped onto the physical constraints of an equivalent electronic network.
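The paper realizes this minimization in analog hardware; purely as a software illustration of the underlying algorithm, the sketch below (our own, not the authors' circuit) runs the classical Horn and Schunck iteration, in which a local averaging term enforces the smoothness constraint and a shared correction term enforces brightness constancy. The function and parameter names (`horn_schunck`, `alpha`, `n_iter`) are ours.

```python
import numpy as np

def horn_schunck(I1, I2, alpha=1.0, n_iter=200):
    # Iteratively minimize the global energy
    #   E = sum (Ix*u + Iy*v + It)^2 + alpha^2 * (|grad u|^2 + |grad v|^2);
    # the first term is brightness constancy, the second is the smoothness
    # constraint that makes the aperture problem well-posed.
    Ix = np.gradient(I1, axis=1)
    Iy = np.gradient(I1, axis=0)
    It = I2 - I1
    u = np.zeros_like(I1)
    v = np.zeros_like(I1)
    for _ in range(n_iter):
        # Local flow averages: the discrete form of the smoothness constraint.
        u_bar = 0.25 * (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
                        np.roll(u, 1, 1) + np.roll(u, -1, 1))
        v_bar = 0.25 * (np.roll(v, 1, 0) + np.roll(v, -1, 0) +
                        np.roll(v, 1, 1) + np.roll(v, -1, 1))
        # Shared correction from the brightness-constancy data term.
        t = (Ix * u_bar + Iy * v_bar + It) / (alpha ** 2 + Ix ** 2 + Iy ** 2)
        u = u_bar - Ix * t
        v = v_bar - Iy * t
    return u, v
```

The averaging step is roughly the kind of constraint a resistive grid enforces physically, which is what makes the mapping onto an electronic network natural.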



Adding Constrained Discontinuities to Gaussian Process Models of Wind Fields

Neural Information Processing Systems

Gaussian Processes provide good prior models for spatial data, but can be too smooth. In many physical situations there are discontinuities along bounding surfaces, for example fronts in near-surface wind fields. We describe a modelling method for such a constrained discontinuity and demonstrate how to infer the model parameters in wind fields with MCMC sampling.
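As a rough illustration of one way such a constraint can be expressed (not necessarily the paper's construction), the sketch below attenuates a squared-exponential covariance between points on opposite sides of a hypothesized front; this keeps the kernel positive semi-definite for 0 <= rho <= 1, since it is the product of two valid kernels. The names `front_kernel`, `front`, and `rho` are our own.

```python
import numpy as np

def se_kernel(x1, x2, length=1.0, var=1.0):
    # Squared-exponential covariance between two points.
    d = np.linalg.norm(x1 - x2)
    return var * np.exp(-0.5 * (d / length) ** 2)

def front_kernel(x1, x2, front, rho=0.1, **kw):
    # `front(x)` is the signed distance to the discontinuity; correlation
    # across the front is scaled by rho (0 = fully decoupled, 1 = no front).
    same_side = np.sign(front(x1)) == np.sign(front(x2))
    return se_kernel(x1, x2, **kw) * (1.0 if same_side else rho)

# Draw one 1-D field with a front at x = 0 (illustrative toy example).
xs = np.linspace(-3, 3, 60).reshape(-1, 1)
front = lambda x: x[0]
K = np.array([[front_kernel(a, b, front) for b in xs] for a in xs])
field = np.random.multivariate_normal(np.zeros(len(xs)), K + 1e-8 * np.eye(len(xs)))
```

In the paper's setting the front location itself is uncertain, which is where MCMC sampling over the model parameters comes in.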


Signal Detection in Noisy Weakly-Active Dendrites

Neural Information Processing Systems

Here we derive measures quantifying the information loss of a synaptic signal due to the presence of neuronal noise sources, as it electrotonically propagates along a weakly-active dendrite. We model the dendrite as an infinite linear cable, with noise sources distributed along its length. The noise sources we consider are thermal noise, channel noise arising from the stochastic nature of voltage-dependent ionic channels (K and Na), and synaptic noise due to spontaneous background activity. We assess the efficacy of information transfer using a signal detection paradigm in which the objective is to detect the presence or absence of a presynaptic spike from the post-synaptic membrane voltage. This allows us to analytically assess the role of each of these noise sources in information transfer. For our choice of parameters, we find that synaptic noise is the dominant noise source limiting the maximum length over which information can be reliably transmitted.

1 Introduction

This is a continuation of our efforts (Manwani and Koch, 1998) to understand the information capacity of a neuronal link (in terms of the specific nature of neural "hardware") through a systematic study of information processing at different biophysical stages in a model of a single neuron. Here we investigate how the presence of neuronal noise sources influences the information transmission capabilities of a simplified model of a weakly-active dendrite. The noise sources we include are thermal noise, channel noise arising from the stochastic nature of voltage-dependent channels (K and Na), and synaptic noise due to spontaneous background activity. We characterize the noise sources using analytical expressions for their current power spectral densities and compare their magnitudes for dendritic parameters reported in the literature (Mainen and Sejnowski, 1998).
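To make the detection paradigm concrete, here is a toy sketch (our own simplification, with made-up parameter values): the three noise sources are lumped into a single white Gaussian noise level per sample, so the minimum error of an ideal observer using a matched filter follows from the discriminability d' = ||s|| / sigma. The paper itself works from the full current power spectral densities of each noise source.

```python
import numpy as np
from scipy.special import erfc

def detection_error(epsp, noise_sigma):
    # Ideal-observer (matched filter) error for detecting a known signal in
    # additive white Gaussian noise with equal priors: P_e = Q(d'/2),
    # where d' = ||s|| / sigma and Q(z) = 0.5 * erfc(z / sqrt(2)).
    d_prime = np.linalg.norm(epsp) / noise_sigma
    return 0.5 * erfc(d_prime / (2 * np.sqrt(2)))

# Toy EPSP: an alpha function, attenuated by exp(-x/lambda) after propagating
# an electrotonic distance x/lambda along the cable (arbitrary units).
t = np.linspace(0, 50e-3, 500)
tau = 5e-3
for x_over_lambda in (0.5, 1.0, 2.0):
    epsp = (t / tau) * np.exp(1 - t / tau) * np.exp(-x_over_lambda)
    print(x_over_lambda, detection_error(epsp, noise_sigma=3.0))
```

The error grows with distance because the attenuated EPSP shrinks relative to the fixed noise floor, which is the mechanism behind a maximum reliable transmission length.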



Reinforcement Learning for Trading

Neural Information Processing Systems

In this paper, we propose to use recurrent reinforcement learning to directly optimize trading system performance functions, and we compare two different reinforcement learning methods. The first, Recurrent Reinforcement Learning, uses immediate rewards to train the trading systems, while the second (Q-Learning (Watkins 1989)) approximates discounted future rewards. These methodologies can be applied to optimizing systems designed to trade a single security or to trade portfolios. In addition, we propose a novel value function for risk-adjusted return that enables learning to be done online: the differential Sharpe ratio. Trading system profits depend upon sequences of interdependent decisions, and are thus path-dependent. When the effects of transaction costs, market impact, and taxes are included, optimal trading decisions require knowledge of the current system state. In Moody, Wu, Liao & Saffell (1998), we demonstrate that reinforcement learning provides a more elegant and effective means for training trading systems when transaction costs are included than do more standard supervised approaches.
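The differential Sharpe ratio admits a simple online implementation. The sketch below follows the exponential-moving-average formulation of Moody et al. (1998); the class name and the adaptation rate value are our own.

```python
class DifferentialSharpe:
    """Online differential Sharpe ratio (Moody et al. 1998): maintain
    exponential moving estimates A of returns and B of squared returns with
    adaptation rate eta; D_t measures the marginal effect of the latest
    return R_t on the Sharpe ratio and serves as an immediate RL reward."""

    def __init__(self, eta=0.01):
        self.eta = eta
        self.A = 0.0   # EMA of returns
        self.B = 0.0   # EMA of squared returns

    def update(self, R):
        dA = R - self.A
        dB = R * R - self.B
        denom = (self.B - self.A ** 2) ** 1.5
        # D_t = (B_{t-1} dA - 0.5 * A_{t-1} dB) / (B_{t-1} - A_{t-1}^2)^{3/2}
        D = (self.B * dA - 0.5 * self.A * dB) / denom if denom > 0 else 0.0
        self.A += self.eta * dA
        self.B += self.eta * dB
        return D

dsr = DifferentialSharpe(eta=0.01)
rewards = [dsr.update(R) for R in (0.002, -0.001, 0.003, 0.001)]
```

Because D_t is available at every step, the recurrent learner can train on risk-adjusted performance without waiting for a full Sharpe ratio estimate, which is what makes online learning possible.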


Robot Docking Using Mixtures of Gaussians

Neural Information Processing Systems

This paper applies the Mixture of Gaussians probabilistic model, combined with Expectation Maximization optimization, to the task of summarizing three-dimensional range data for a mobile robot. This provides a flexible way of dealing with uncertainties in sensor information, and allows the introduction of prior knowledge into low-level perception modules. Problems with the basic approach were solved in several ways: the mixture of Gaussians was reparameterized to reflect the types of objects expected in the scene, and priors on model parameters were included in the optimization process. Both modifications force the optimization to find 'interesting' objects, given the sensor and object characteristics. A higher-level classifier was used to interpret the results provided by the model, and to reject spurious solutions.
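As a reference point, here is a minimal EM loop for a generic mixture of Gaussians with spherical covariances. This is a sketch of the standard algorithm only; the paper's contribution, the object-type reparameterization and the parameter priors, is not shown here.

```python
import numpy as np

def em_mog(X, k, n_iter=50, seed=0):
    # Standard EM for a k-component mixture of spherical Gaussians.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mu = X[rng.choice(n, k, replace=False)]   # initial means from the data
    var = np.full(k, X.var())                 # spherical variances
    pi = np.full(k, 1.0 / k)                  # mixing weights
    for _ in range(n_iter):
        # E-step: responsibilities r[i, j] = p(component j | point i).
        d2 = ((X[:, None, :] - mu[None]) ** 2).sum(-1)
        log_r = np.log(pi) - 0.5 * d * np.log(var) - d2 / (2 * var)
        r = np.exp(log_r - log_r.max(1, keepdims=True))
        r /= r.sum(1, keepdims=True)
        # M-step: re-estimate weights, means, and variances.
        nk = r.sum(0)
        pi = nk / n
        mu = (r.T @ X) / nk[:, None]
        d2 = ((X[:, None, :] - mu[None]) ** 2).sum(-1)
        var = (r * d2).sum(0) / (d * nk)
    return pi, mu, var
```

Constraining the component shapes (e.g. to planar wall-like structures) and adding priors amounts to replacing the unconstrained M-step above with a maximum a posteriori update over the restricted parameterization.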


Optimizing Admission Control while Ensuring Quality of Service in Multimedia Networks via Reinforcement Learning

Neural Information Processing Systems

This paper examines the application of reinforcement learning to a telecommunications networking problem. The problem requires that revenue be maximized while simultaneously meeting a quality of service constraint that forbids entry into certain states. We present a general solution to this multi-criteria problem that is able to earn significantly higher revenues than alternatives.
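One simple way to realize "forbidding entry into certain states" in tabular Q-learning is to mask any action whose state-action pair violates the constraint, so that exploration and exploitation both range over admissible actions only. The sketch below is our own illustration, not the paper's algorithm; the environment interface (`env.reset`, `env.step`) and the `forbidden` predicate are assumed placeholder names.

```python
import numpy as np

def constrained_q_learning(env, n_states, n_actions, forbidden,
                           episodes=5000, alpha=0.1, gamma=0.95, eps=0.1):
    # Tabular Q-learning with a hard QoS constraint: pairs flagged by
    # forbidden(s, a) are never taken, so the learned policy maximizes
    # revenue over admissible policies only.
    # Assumed interfaces: env.reset() -> s; env.step(s, a) -> (s2, r, done).
    Q = np.zeros((n_states, n_actions))
    rng = np.random.default_rng(0)
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            allowed = [a for a in range(n_actions) if not forbidden(s, a)]
            if rng.random() < eps:
                a = int(rng.choice(allowed))              # explore
            else:
                a = max(allowed, key=lambda a: Q[s, a])   # exploit
            s2, r, done = env.step(s, a)
            nxt = [a2 for a2 in range(n_actions) if not forbidden(s2, a2)]
            target = r + (gamma * max(Q[s2, a2] for a2 in nxt)
                          if nxt and not done else 0.0)
            Q[s, a] += alpha * (target - Q[s, a])
            s = s2
    return Q
```

Masking keeps the constraint hard (it can never be traded off against revenue), in contrast to penalty terms, which only discourage violations.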


A Phase Space Approach to Minimax Entropy Learning and the Minutemax Approximations

Neural Information Processing Systems

There has been much recent work on measuring image statistics and on learning probability distributions on images. We observe that the mapping from images to statistics is many-to-one and show that it can be quantified by a phase space factor. This phase space approach throws light on the Minimax Entropy technique for learning Gibbs distributions on images with potentials derived from image statistics, and elucidates the ambiguities that are inherent in determining the potentials. In addition, it shows that if the phase factor can be approximated by an analytic distribution, then this approximation yields a swift "Minutemax" algorithm that vastly reduces the computation time for Minimax Entropy learning. An illustration of this concept, using a Gaussian to approximate the phase factor, gives a good approximation to the results of Zhu and Mumford (1997) in just seconds of CPU time. The phase space approach also gives insight into the multi-scale potentials found by Zhu and Mumford (1997) and suggests that the forms of the potentials are influenced greatly by phase space considerations. Finally, we prove that probability distributions learned in feature space alone are equivalent to Minimax Entropy learning with a multinomial approximation of the phase factor.

1 Introduction

Bayesian probability theory gives a powerful framework for visual perception (Knill and Richards 1996). This approach, however, requires specifying prior probabilities and likelihood functions. Learning these probabilities is difficult because it requires estimating distributions on random variables of very high dimensions (for example, images with 200 x 200 pixels, or shape curves of length 400 pixels).
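The many-to-one observation is easy to see numerically: draw images uniformly at random, evaluate a statistic, and histogram the results; the induced density is roughly proportional to the phase-space factor for that statistic. The toy sketch below uses a statistic of our own choosing (mean absolute horizontal gradient), whose near-Gaussian histogram echoes the Gaussian approximation behind the "Minutemax" algorithm.

```python
import numpy as np

def phase_factor_mc(statistic, shape=(16, 16), n_samples=20000, bins=30, seed=0):
    # Monte Carlo estimate of the phase-space factor: sample images uniformly,
    # then histogram the statistic. The density reflects how many images map
    # to each statistic value (the mapping is many-to-one).
    rng = np.random.default_rng(seed)
    vals = np.array([statistic(rng.random(shape)) for _ in range(n_samples)])
    return np.histogram(vals, bins=bins, density=True)

# Example statistic (our choice): mean absolute horizontal gradient.
stat = lambda im: np.abs(np.diff(im, axis=1)).mean()
hist, edges = phase_factor_mc(stat)
# By the central limit theorem this phase factor is close to Gaussian,
# mirroring the analytic approximation that gives the speedup.
```
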