A Practical Monte Carlo Implementation of Bayesian Learning
A practical method for Bayesian training of feed-forward neural networks using sophisticated Monte Carlo methods is presented and evaluated. In reasonably small amounts of computer time this approach outperforms other state-of-the-art methods on 5 data-limited tasks from real-world domains. 1 INTRODUCTION Bayesian learning uses a prior on model parameters, combines this with information from a training set, and then integrates over the resulting posterior to make predictions. With this approach, we can use large networks without fear of overfitting, allowing us to capture more structure in the data, thus improving prediction accuracy and eliminating the tedious search (often performed using cross validation) for the model complexity that optimises the bias/variance tradeoff. In this approach the size of the model is limited only by computational considerations. The application of Bayesian learning to neural networks has been pioneered by MacKay (1992), who uses a Gaussian approximation to the posterior weight distribution.
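The core idea above - sample weights from the posterior and average the resulting predictions - can be sketched in miniature. The following is an illustrative toy, not the paper's method: a single-weight linear model, a Gaussian prior, and plain random-walk Metropolis standing in for the sophisticated Monte Carlo machinery; all names and constants are this example's own assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + rng.normal(0, 0.1, size=4)   # data generated with slope ~2

def log_posterior(w):
    # Gaussian prior N(0, 10^2) on the weight, Gaussian likelihood (noise std 0.1)
    log_prior = -0.5 * w**2 / 100.0
    log_lik = -0.5 * np.sum((y - w * x) ** 2) / 0.1**2
    return log_prior + log_lik

# random-walk Metropolis over the single weight
w, samples = 0.0, []
for t in range(5000):
    prop = w + rng.normal(0, 0.1)
    if np.log(rng.random()) < log_posterior(prop) - log_posterior(w):
        w = prop
    if t >= 1000:                  # discard burn-in samples
        samples.append(w)

# Bayesian prediction: the integral over the posterior becomes a sample average
x_new = 4.0
pred = np.mean([s * x_new for s in samples])
```

Because the prediction averages over many plausible weights rather than committing to one point estimate, enlarging the model does not by itself cause overfitting, which is the property the abstract emphasizes.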
Parallel Optimization of Motion Controllers via Policy Iteration
Coelho, Jefferson A., Jr., Sitaraman, R., Grupen, Roderic A.
This paper describes a policy iteration algorithm for optimizing the performance of a harmonic function-based controller with respect to a user-defined index. Value functions are represented as potential distributions over the problem domain, while control policies are represented as gradient fields over the same domain. All intermediate policies are intrinsically safe, i.e. collisions are not promoted during the adaptation process. The algorithm has an efficient implementation on parallel SIMD architectures. One potential application - travel distance minimization - illustrates its usefulness.
EM Optimization of Latent-Variable Density Models
Bishop, Christopher M., Svensén, Markus, Williams, Christopher K. I.
There is currently considerable interest in developing general nonlinear density models based on latent, or hidden, variables. Such models have the ability to discover the presence of a relatively small number of underlying 'causes' which, acting in combination, give rise to the apparent complexity of the observed data set. Unfortunately, to train such models generally requires large computational effort. In this paper we introduce a novel latent variable algorithm which retains the general nonlinear capabilities of previous models but which uses a training procedure based on the EM algorithm. We demonstrate the performance of the model on a toy problem and on data from flow diagnostics for a multiphase oil pipeline.
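The EM training loop that makes such models tractable can be illustrated on the simplest latent-variable density model: a two-component Gaussian mixture, where the latent 'cause' is which component generated each point. This is a generic sketch for intuition, not the paper's nonlinear model; all values below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
# data from two latent "causes": clusters near -2 and +2
data = np.concatenate([rng.normal(-2, 0.5, 200), rng.normal(2, 0.5, 200)])

mu = np.array([-1.0, 1.0])     # initial component means
var = np.array([1.0, 1.0])     # initial component variances
pi = np.array([0.5, 0.5])      # initial mixing proportions

for _ in range(50):
    # E-step: posterior responsibility of each latent component for each point
    dens = pi * np.exp(-0.5 * (data[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: re-estimate parameters from the responsibilities
    nk = resp.sum(axis=0)
    mu = (resp * data[:, None]).sum(axis=0) / nk
    var = (resp * (data[:, None] - mu) ** 2).sum(axis=0) / nk
    pi = nk / len(data)
```

Each iteration is a closed-form update, which is why an EM-based training procedure avoids the large gradient-descent effort the abstract mentions.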
Modeling Interactions of the Rat's Place and Head Direction Systems
Redish, A. David, Touretzky, David S.
We have developed a computational theory of rodent navigation that includes analogs of the place cell system, the head direction system, and path integration. In this paper we present simulation results showing how interactions between the place and head direction systems can account for recent observations about hippocampal place cell responses to doubling and/or rotation of cue cards in a cylindrical arena (Sharp et al.).
Improving Policies without Measuring Merits
Dayan, Peter, Singh, Satinder P.
Performing policy iteration in dynamic programming should only require knowledge of relative rather than absolute measures of the utility of actions (Werbos, 1991) - what Baird (1993) calls the advantages of actions at states. Nevertheless, most existing methods in dynamic programming (including Baird's) compute some form of absolute utility function. For smooth problems, advantages satisfy two differential consistency conditions (including the requirement that they be free of curl), and we show that enforcing these can lead to appropriate policy improvement solely in terms of advantages. 1 Introduction In deciding how to change a policy at a state, an agent only needs to know the differences (called advantages) between the total return from taking each action a for one step and then following the policy forever after, and the total return from always following the policy (the conventional value of the state under the policy). The advantages are like differentials - they do not depend on the local level of the total return. Indeed, Werbos (1991) defined Dual Heuristic Programming (DHP) using these facts, learning the derivatives of these total returns with respect to the state.
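The point that improvement needs only advantages, never absolute values, can be seen in a standard policy iteration loop: the greedy step compares A(s, a) = r(s, a) + gamma * V(s') - V(s) across actions, and adding any constant to V at a state leaves these comparisons unchanged. The tiny deterministic MDP below is an invented illustration, not from the paper.

```python
import numpy as np

# tiny deterministic MDP: states 0..2, actions 0 (stay) and 1 (advance),
# reward 1 for the transition from state 1 to state 2, discount 0.9
P = {0: {0: 0, 1: 1}, 1: {0: 1, 1: 2}, 2: {0: 2, 1: 2}}
R = {0: {0: 0.0, 1: 0.0}, 1: {0: 0.0, 1: 1.0}, 2: {0: 0.0, 1: 0.0}}
gamma = 0.9

policy = {0: 0, 1: 0, 2: 0}        # start with the "stay" policy everywhere
for _ in range(5):
    # policy evaluation by fixed-point iteration on V
    V = np.zeros(3)
    for _ in range(200):
        V = np.array([R[s][policy[s]] + gamma * V[P[s][policy[s]]] for s in range(3)])
    # advantages: relative merit of each action over the current policy
    A = {s: {a: R[s][a] + gamma * V[P[s][a]] - V[s] for a in (0, 1)} for s in range(3)}
    # the improvement step consults only A, never the absolute returns
    policy = {s: max((0, 1), key=lambda a, s=s: A[s][a]) for s in range(3)}
```

After a few iterations the policy advances toward the rewarded transition; shifting V by a constant per state would change every A(s, a) at that state equally and leave the argmax intact, which is the sense in which advantages are "like differentials".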
Classifying Facial Action
Bartlett, Marian Stewart, Viola, Paul A., Sejnowski, Terrence J., Golomb, Beatrice A., Larsen, Jan, Hager, Joseph C., Ekman, Paul
Measurement of facial expressions is important for research and assessment in psychiatry, neurology, and experimental psychology (Ekman, Huang, Sejnowski, & Hager, 1992), and has technological applications in consumer-friendly user interfaces, interactive video, and entertainment rating. The Facial Action Coding System (FACS) is a method for measuring facial expressions in terms of activity in the underlying facial muscles (Ekman & Friesen, 1978). We are exploring ways to automate FACS.
Stable Linear Approximations to Dynamic Programming for Stochastic Control Problems with Local Transitions
Roy, Benjamin Van, Tsitsiklis, John N.
Recently, however, there have been some successful applications of neural networks in a totally different context - that of sequential decision making under uncertainty (stochastic control). Stochastic control problems have been studied extensively in the operations research and control theory literature for a long time, using the methodology of dynamic programming [Bertsekas, 1995]. In dynamic programming, the most important object is the cost-to-go (or value) function, which evaluates the expected future cost.
Using Feedforward Neural Networks to Monitor Alertness from Changes in EEG Correlation and Coherence
Makeig, Scott, Jung, Tzyy-Ping, Sejnowski, Terrence J.
We report here that changes in the normalized electroencephalographic (EEG) cross-spectrum can be used in conjunction with feedforward neural networks to monitor changes in alertness of operators continuously and in near-real time. Previously, we have shown that EEG spectral amplitudes covary with changes in alertness as indexed by changes in behavioral error rate on an auditory detection task [6,4]. Here, we report for the first time that increases in the frequency of detection errors in this task are also accompanied by patterns of increased and decreased spectral coherence in several frequency bands and EEG channel pairs. Relationships between EEG coherence and performance vary between subjects, but within subjects, their topographic and spectral profiles appear stable from session to session. Changes in alertness also covary with changes in correlations among EEG waveforms recorded at different scalp sites, and neural networks can also estimate alertness from correlation changes in spontaneous and unobtrusively recorded EEG signals. 1 Introduction When humans become drowsy, EEG scalp recordings of potential oscillations change dramatically in frequency, amplitude, and topographic distribution [3]. These changes are complex and differ between subjects [10].
Tempering Backpropagation Networks: Not All Weights are Created Equal
Schraudolph, Nicol N., Sejnowski, Terrence J.
Backpropagation learning algorithms typically collapse the network's structure into a single vector of weight parameters to be optimized. We suggest that their performance may be improved by utilizing the structural information instead of discarding it, and introduce a framework for "tempering" each weight accordingly. In the tempering model, activation and error signals are treated as approximately independent random variables. The characteristic scale of weight changes is then matched to that of the residuals, allowing structural properties such as a node's fan-in and fan-out to affect the local learning rate and backpropagated error. The model also permits calculation of an upper bound on the global learning rate for batch updates, which in turn leads to different update rules for bias vs. non-bias weights. This approach yields hitherto unparalleled performance on the family relations benchmark, a deep multi-layer network: for both batch learning with momentum and the delta-bar-delta algorithm, convergence at the optimal learning rate is sped up by more than an order of magnitude.
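The structural idea - let each weight's learning rate depend on where it sits in the network - can be sketched with a simple fan-in-based scaling. The 1/sqrt(fan-in) rule below is an illustrative assumption standing in for the paper's derivation, and the layer sizes and "gradients" are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(2)

# (fan_in, fan_out) for each layer of a small feed-forward net
sizes = [(8, 16), (16, 4), (4, 1)]
weights = [rng.normal(0, 1 / np.sqrt(fi), (fi, fo)) for fi, fo in sizes]

base_lr = 0.1
# tempered local learning rates: shrink with fan-in so that the scale of
# each node's summed weight update stays comparable across layers
# (an illustrative rule, not the paper's exact one)
local_lrs = [base_lr / np.sqrt(fi) for fi, fo in sizes]

# one illustrative update step with random stand-in gradients
grads = [rng.normal(0, 1, w.shape) for w in weights]
weights = [w - lr * g for w, lr, g in zip(weights, local_lrs, grads)]
```

Under this rule the wide 16-input layer gets the smallest step and the narrow 4-input layer the largest, in contrast to a structure-blind optimizer that applies one global rate to the flattened weight vector.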
Beating a Defender in Robotic Soccer: Memory-Based Learning of a Continuous Function
Stone, Peter, Veloso, Manuela M.
Our research works towards this broad goal from a Machine Learning perspective. We are particularly interested in investigating how an intelligent agentcan choose an action in an adversarial environment. We assume that the agent has a specific goal to achieve. We conduct this investigation in a framework whereteams of agents compete in a game of robotic soccer. The real system of model cars remotely controlled from off-board computers is under development.