 Machine Learning


Stable Dynamic Parameter Adaption

Neural Information Processing Systems

A stability criterion for dynamic parameter adaptation is given. In the case of the learning rate of backpropagation, a class of stable algorithms is presented and studied, including a convergence proof.


Onset-based Sound Segmentation

Neural Information Processing Systems

A technique for segmenting sounds using processing based on mammalian early auditory processing is presented. The technique is based on features in sound which neuron spike recording suggests are detected in the cochlear nucleus. The sound signal is bandpass filtered, and each resulting band is processed to enhance onsets and offsets. The onset and offset signals are compressed, then clustered both in time and across frequency channels using a network of integrate-and-fire neurons. Onsets and offsets are signalled by spikes, and the timing of these spikes is used to segment the sound.

1 Background

Traditional speech interpretation techniques based on Fourier transforms, spectrum recoding, and a hidden Markov model or neural network interpretation stage have limitations both in continuous speech and in interpreting speech in the presence of noise, and this has led to interest in front ends modelling biological auditory systems for speech interpretation systems (Ainsworth and Meyer 92; Cosi 93; Cole et al 95).
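The onset-to-spike idea in the abstract can be illustrated with a deliberately simplified, single-channel sketch. It is not the authors' model: the onset enhancement here is just a half-wave rectified difference of an amplitude envelope, there is no compression or cross-channel clustering, and the names `onset_signal` and `integrate_and_fire`, the threshold, and the leak factor are all assumptions chosen for illustration.

```python
def onset_signal(envelope):
    """Half-wave rectified first difference: positive only where the envelope rises."""
    onsets = [0.0]  # no previous sample at t = 0, so no onset there
    for prev, cur in zip(envelope, envelope[1:]):
        onsets.append(max(0.0, cur - prev))
    return onsets


def integrate_and_fire(inputs, threshold=0.5, leak=0.9):
    """Leaky integrate-and-fire unit; returns the time indices at which it spikes."""
    v = 0.0  # membrane potential
    spikes = []
    for t, x in enumerate(inputs):
        v = leak * v + x      # decay the potential, then add the new input
        if v >= threshold:
            spikes.append(t)  # the spike marks a candidate segment boundary
            v = 0.0           # reset after firing
    return spikes


if __name__ == "__main__":
    # Toy envelope: silence, a burst, silence, a second burst.
    envelope = [0.0] * 5 + [1.0] * 5 + [0.0] * 5 + [1.0] * 5
    print(integrate_and_fire(onset_signal(envelope)))
```

In this toy setting the spike times coincide with the two rising edges of the envelope. An offset detector could be sketched the same way by rectifying the negative difference instead.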


Gaussian Processes for Regression

Neural Information Processing Systems

The Bayesian analysis of neural networks is difficult because a simple prior over weights implies a complex prior distribution over functions. In this paper we investigate the use of Gaussian process priors over functions, which permit the predictive Bayesian analysis for fixed values of hyperparameters to be carried out exactly using matrix operations. Two methods, using optimization and averaging (via Hybrid Monte Carlo) over hyperparameters, have been tested on a number of challenging problems and have produced excellent results.

1 INTRODUCTION

In the Bayesian approach to neural networks a prior distribution over the weights induces a prior distribution over functions. This prior is combined with a noise model, which specifies the probability of observing the targets t given function values y, to yield a posterior over functions which can then be used for predictions. For neural networks the prior over functions has a complex form which means that implementations must either make approximations (e.g.
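As a rough illustration of the claim that, for fixed hyperparameters, Gaussian process prediction reduces to matrix operations, the following sketch computes the predictive mean and variance at one test input. It is not taken from the paper: the squared-exponential covariance, the scalar inputs, the Gaussian noise model, and all function and parameter names (`rbf`, `gp_predict`, `noise_var`, `length_scale`, `signal_var`) are assumptions made for the example.

```python
import math


def rbf(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between two scalar inputs (assumed kernel)."""
    return signal_var * math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def cholesky(A):
    """Lower-triangular L with L L^T = A, for symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def cho_solve(L, b):
    """Solve (L L^T) x = b by forward, then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def gp_predict(x_train, t_train, x_star, noise_var=0.01,
               length_scale=1.0, signal_var=1.0):
    """Predictive mean and variance of the latent function at x_star."""
    n = len(x_train)
    # Covariance of the noisy targets: K + noise_var * I.
    K = [[rbf(x_train[i], x_train[j], length_scale, signal_var)
          + (noise_var if i == j else 0.0) for j in range(n)] for i in range(n)]
    L = cholesky(K)
    k_star = [rbf(x, x_star, length_scale, signal_var) for x in x_train]
    alpha = cho_solve(L, t_train)            # (K + noise I)^{-1} t
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = cho_solve(L, k_star)                 # (K + noise I)^{-1} k_star
    var = signal_var - sum(k * vi for k, vi in zip(k_star, v))
    return mean, var


if __name__ == "__main__":
    print(gp_predict([0.0, 1.0, 2.0], [0.0, 0.8, 0.9], 1.5))
```

The hyperparameters are held fixed here. Optimizing them or averaging over them, as the abstract describes, would wrap this prediction step in an outer procedure that is not shown.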


Recurrent Neural Networks for Missing or Asynchronous Data

Neural Information Processing Systems

In this paper we propose recurrent neural networks with feedback into the input units for handling two types of data analysis problems. On the one hand, this scheme can be used for static data when some of the input variables are missing. On the other hand, it can also be used for sequential data, when some of the input variables are missing or are available at different frequencies.


Improved Gaussian Mixture Density Estimates Using Bayesian Penalty Terms and Network Averaging

Neural Information Processing Systems

We compare two regularization methods which can be used to improve the generalization capabilities of Gaussian mixture density estimates. The first method uses a Bayesian prior on the parameter space. We derive EM (Expectation Maximization) update rules which maximize the a posteriori parameter probability. In the second approach we apply ensemble averaging to density estimation. This includes Breiman's "bagging", which recently has been found to produce impressive results for classification networks.


Investment Learning with Hierarchical PSOMs

Neural Information Processing Systems

We propose a hierarchical scheme for rapid learning of context-dependent "skills" that is based on the recently introduced "Parameterized Self Organizing Map" ("PSOM"). The underlying idea is to first invest some learning effort to specialize the system into a rapid learner for a more restricted range of contexts. The specialization is carried out by a prior "investment learning stage", during which the system acquires a set of basis mappings or "skills" for a set of prototypical contexts. Adaptation of a "skill" to a new context can then be achieved by interpolating in the space of the basis mappings and thus can be extremely rapid. We demonstrate the potential of this approach for the task of a 3D visuomotor map for a Puma robot and two cameras. This includes the forward and backward robot kinematics in 3D end effector coordinates, the 2D+2D retina coordinates and also the 6D joint angles. After the investment phase the transformation can be learned for a new camera setup with a single observation.


Handwritten Word Recognition using Contextual Hybrid Radial Basis Function Network/Hidden Markov Models

Neural Information Processing Systems

A hybrid and contextual radial basis function network/hidden Markov model off-line handwritten word recognition system is presented. The task assigned to the radial basis function networks is the estimation of emission probabilities associated with Markov states. The model is contextual because the estimation of emission probabilities takes into account the left context of the current image segment as represented by its predecessor in the sequence. The new system does not outperform the previous system without context but acts differently.


Dynamics of On-Line Gradient Descent Learning for Multilayer Neural Networks

Neural Information Processing Systems

We consider the problem of online gradient descent learning for general two-layer neural networks. An analytic solution is presented and used to investigate the role of the learning rate in controlling the evolution and convergence of the learning process. Two-layer networks with an arbitrary number of hidden units have been shown to be universal approximators [1] for such N-to-one dimensional maps. We investigate the emergence of generalization ability in an online learning scenario [2], in which the couplings are modified after the presentation of each example so as to minimize the corresponding error. The resulting changes in {J} are described as a dynamical evolution; the number of examples plays the role of time.
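The online scenario described above, in which the couplings are updated after every single example, can be sketched in a few lines. This is a toy simulation and not the paper's analytic solution: the teacher-student setup, the erf-shaped hidden activation, the unit hidden-to-output weights, the small network sizes, and every name and default value (`run`, `online_step`, `eta`, `n_inputs`, `n_hidden`, `steps`, `seed`) are assumptions chosen for illustration.

```python
import math
import random


def g(x):
    """Hidden-unit activation (assumed): a sigmoid built from the error function."""
    return math.erf(x / math.sqrt(2.0))


def g_prime(x):
    """Derivative of g."""
    return math.sqrt(2.0 / math.pi) * math.exp(-0.5 * x * x)


def output(weights, xi):
    """Two-layer network output: sum of hidden activations (output weights fixed to 1)."""
    return sum(g(sum(w * x for w, x in zip(row, xi))) for row in weights)


def online_step(J, B, xi, eta):
    """One gradient step on the squared error of a single example, in place."""
    n = len(xi)
    zeta = output(B, xi)                                   # teacher's target
    fields = [sum(w * x for w, x in zip(row, xi)) for row in J]
    sigma = sum(g(h) for h in fields)                      # student's output
    for row, h in zip(J, fields):
        delta = g_prime(h) * (zeta - sigma)
        for k in range(n):
            row[k] += (eta / n) * delta * xi[k]


def generalization_error(J, B, test_inputs):
    """Average of 0.5 * (student - teacher)^2 over a fixed set of test inputs."""
    total = sum(0.5 * (output(J, xi) - output(B, xi)) ** 2 for xi in test_inputs)
    return total / len(test_inputs)


def run(n_inputs=10, n_hidden=2, steps=5000, eta=0.5, seed=0):
    rng = random.Random(seed)
    # Teacher couplings B: orthonormal unit vectors. Student couplings J: near zero.
    B = [[1.0 if k == i else 0.0 for k in range(n_inputs)] for i in range(n_hidden)]
    J = [[rng.gauss(0.0, 0.01) for _ in range(n_inputs)] for _ in range(n_hidden)]
    test_inputs = [[rng.gauss(0.0, 1.0) for _ in range(n_inputs)] for _ in range(500)]
    before = generalization_error(J, B, test_inputs)
    for _ in range(steps):  # each example is seen once; the example count acts as time
        xi = [rng.gauss(0.0, 1.0) for _ in range(n_inputs)]
        online_step(J, B, xi, eta)
    return before, generalization_error(J, B, test_inputs)


if __name__ == "__main__":
    print(run())
```

Varying `eta` in this sketch gives an informal feel for how the learning rate affects the speed and stability of convergence, which the paper treats analytically.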


Unsupervised Pixel-prediction

Neural Information Processing Systems

When a sensory system constructs a model of the environment from its input, it might need to verify the model's accuracy. One method of verification is multivariate time-series prediction: a good model could predict the near-future activity of its inputs, much as a good scientific theory predicts future data. Such a predicting model would require copious top-down connections to compare the predictions with the input. That feedback could improve the model's performance in two ways: by biasing internal activity toward expected patterns, and by generating specific error signals if the predictions fail. A proof-of-concept model (an event-driven, computationally efficient layered network incorporating "cortical" features like all-excitatory synapses and local inhibition) was constructed to make near-future predictions of a simple, moving stimulus. After unsupervised learning, the network contained units not only tuned to obvious features of the stimulus like contour orientation and motion, but also to contour discontinuity ("end-stopping") and illusory contours.


Human Face Detection in Visual Scenes

Neural Information Processing Systems

We present a neural network-based face detection system. A retinally connected neural network examines small windows of an image, and decides whether each window contains a face. The system arbitrates between multiple networks to improve performance over a single network. We use a bootstrap algorithm for training, which adds false detections into the training set as training progresses. This eliminates the difficult task of manually selecting non-face training examples, which must be chosen to span the entire space of non-face images.
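The bootstrap training loop described in the abstract (train, collect false detections on non-face data, add them to the negative set, retrain) can be sketched generically. The neural network is replaced here by a trivial one-dimensional threshold classifier so the example stays runnable; that stand-in, the integer "windows", and the names `bootstrap_train`, `train_threshold`, and `predict_threshold` are hypothetical and not taken from the paper.

```python
def bootstrap_train(train, predict, positives, negatives, negative_pool, max_rounds=10):
    """Retrain repeatedly, adding false detections from negative_pool as negatives."""
    negatives = list(negatives)  # copy so the caller's list is not modified
    model = train(positives, negatives)
    for _ in range(max_rounds):
        # Anything detected in the pool is a false detection, since the pool has no positives.
        false_detections = [x for x in negative_pool
                            if predict(model, x) and x not in negatives]
        if not false_detections:
            break
        negatives.extend(false_detections)
        model = train(positives, negatives)
    return model, negatives


# Toy stand-in for the detector: a threshold on a single scalar feature.
def train_threshold(positives, negatives):
    """Place the threshold midway between the closest positive and negative."""
    return (min(positives) + max(negatives)) / 2.0


def predict_threshold(threshold, x):
    """Report a detection when the feature exceeds the threshold."""
    return x > threshold


if __name__ == "__main__":
    model, negatives = bootstrap_train(
        train_threshold, predict_threshold,
        positives=[8, 9, 10], negatives=[0, 1], negative_pool=list(range(8)))
    print(model, sorted(negatives))
```

The point of the design is that the negative set grows only with examples the current model gets wrong, so nobody has to hand-pick negatives that span the whole non-face space.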