From Isolation to Cooperation: An Alternative View of a System of Experts

Neural Information Processing Systems

We introduce a constructive, incremental learning system for regression problems that models data by means of locally linear experts. In contrast to other approaches, the experts are trained independently and do not compete for data during learning. Only when a prediction for a query is required do the experts cooperate by blending their individual predictions. Each expert is trained by minimizing a penalized local cross validation error using second order methods. In this way, an expert is able to find a local distance metric by adjusting the size and shape of the receptive field in which its predictions are valid, and also to detect relevant input features by adjusting its bias on the importance of individual input dimensions. We derive asymptotic results for our method. In a variety of simulations the properties of the algorithm are demonstrated with respect to interference, learning speed, prediction accuracy, feature detection, and task oriented incremental learning.
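The blending step described above can be sketched as follows (a minimal illustration, not the paper's implementation; the Gaussian receptive fields, the expert parameters, and the toy |x| target are our own assumptions):

```python
import numpy as np

def expert_prediction(x, w, b):
    """Locally linear model y = w*x + b."""
    return w * x + b

def receptive_field(x, center, width):
    """Gaussian activation describing where an expert's prediction is valid."""
    return np.exp(-0.5 * ((x - center) / width) ** 2)

def blended_prediction(x, experts):
    """Blend independently trained experts, weighted by their activations."""
    acts = np.array([receptive_field(x, c, s) for (_, _, c, s) in experts])
    preds = np.array([expert_prediction(x, w, b) for (w, b, _, _) in experts])
    return np.sum(acts * preds) / np.sum(acts)

# Two experts approximating y = |x| on either side of the origin:
# each tuple is (slope, offset, receptive-field center, width).
experts = [(-1.0, 0.0, -1.0, 0.5),
           (+1.0, 0.0, +1.0, 0.5)]
print(blended_prediction(-1.0, experts))  # dominated by the left expert
```

Note that the experts never interact during training; cooperation happens only in the normalized weighted sum at query time, which is what lets each expert keep its own receptive-field shape.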


Discovering Structure in Continuous Variables Using Bayesian Networks

Neural Information Processing Systems

We study Bayesian networks for continuous variables using nonlinear conditional density estimators. We demonstrate that useful structures can be extracted from a data set in a self-organized way, and we present sampling techniques for belief update based on Markov blanket conditional density models. 1 Introduction One of the strongest types of information that can be learned about an unknown process is the discovery of dependencies and, even more important, of independencies. A superior example is medical epidemiology, where the goal is to find the causes of a disease and exclude factors which are irrelevant.
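As an illustration of belief update by sampling from Markov-blanket conditionals, here is a minimal Gibbs sampler for a bivariate Gaussian, where each full conditional is known in closed form (the model and all numbers are our assumptions, not the paper's):

```python
import math
import random

def gibbs_bivariate_gaussian(rho, n=50000, seed=1):
    """Gibbs sampling for a standard bivariate Gaussian with correlation rho.
    Each variable's Markov blanket is the other variable, and the full
    conditional is X | Y=y ~ N(rho*y, 1 - rho^2) (and symmetrically for Y)."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    sd = math.sqrt(1.0 - rho * rho)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(rho * y, sd)   # sample X from its Markov blanket (Y)
        y = rng.gauss(rho * x, sd)   # sample Y from its Markov blanket (X)
        xs.append(x)
        ys.append(y)
    # The empirical correlation of the chain should approach rho.
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / n
    vx = sum((a - mx) ** 2 for a in xs) / n
    vy = sum((b - my) ** 2 for b in ys) / n
    return cov / math.sqrt(vx * vy)

print(gibbs_bivariate_gaussian(0.8))  # close to 0.8
```

In the paper's setting the closed-form Gaussian conditionals would be replaced by learned nonlinear conditional density models, but the update scheme is the same.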


Learning Fine Motion by Markov Mixtures of Experts

Neural Information Processing Systems

Compliant control is a standard method for performing fine manipulation tasks, like grasping and assembly, but it requires estimation of the state of contact (s.o.c.) between the robot arm and the objects involved. Here we present a method to learn a model of the movement from measured data. The method requires little or no prior knowledge and the resulting model explicitly estimates the s.o.c. The current s.o.c. is viewed as the hidden state variable of a discrete HMM. The control-dependent transition probabilities between states are modeled as parametrized functions of the measurement. We show that their parameters can be estimated from measurements at the same time as the parameters of the movement in each s.o.c.
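Tracking a hidden s.o.c. through control-dependent transitions can be sketched with the standard HMM forward (filtering) recursion; the two states, the logistic transition model, and all numbers below are illustrative assumptions, not the paper's model:

```python
import numpy as np

# States: 0 = "free motion", 1 = "in contact".
def transition_matrix(u):
    """Control-dependent transitions: pushing harder (larger u) makes
    entering contact more likely."""
    p = 1.0 / (1.0 + np.exp(-u))        # probability of moving into contact
    return np.array([[1 - p, p],
                     [0.1,   0.9]])

def forward_step(belief, u, likelihood):
    """One step of the HMM forward recursion: predict with the
    control-dependent transitions, then reweight by the observation."""
    predicted = belief @ transition_matrix(u)
    posterior = predicted * likelihood
    return posterior / posterior.sum()

belief = np.array([1.0, 0.0])           # start certain of free motion
for u, lik in [(2.0, np.array([0.2, 0.8])),
               (2.0, np.array([0.1, 0.9]))]:
    belief = forward_step(belief, u, lik)
print(belief)  # belief mass shifts toward the contact state
```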


A Neural Network Classifier for the I100 OCR Chip

Neural Information Processing Systems

Therefore, we want c to be less than 0.5. In order to get a 2:1 margin, we choose c = 0.25. The classifier is trained only on individual partial characters instead of all possible combinations of partial characters. Therefore, we can specify the classifier using only 1523 constraints, instead of creating a training set of approximately 128,000 possible combinations of partial characters. Applying these constraints is therefore much faster than back-propagation on the entire data set.


Silicon Models for Auditory Scene Analysis

Neural Information Processing Systems

We are developing special-purpose, low-power analog-to-digital converters for speech and music applications that feature analog circuit models of biological audition to process the audio signal before conversion. This paper describes our most recent converter design, and a working system that uses several copies of the chip to compute multiple representations of sound from an analog input. This multi-representation system demonstrates the plausibility of inexpensively implementing an auditory scene analysis approach to sound processing. 1. INTRODUCTION The visual system computes multiple representations of the retinal image, such as motion, orientation, and stereopsis, as an early step in scene analysis. Likewise, the auditory brainstem computes secondary representations of sound, emphasizing properties such as binaural disparity, periodicity, and temporal onsets. Recent research in auditory scene analysis involves using computational models of these auditory brainstem representations in engineering applications. Computation is a major limitation in auditory scene analysis research: the complete auditory processing system described in (Brown and Cooke, 1994) operates at approximately 4000 times real time, running under UNIX on a Sun SPARCstation 1. Standard approaches to hardware acceleration for signal processing algorithms could be used to ease this computational burden in a research environment; a variety of parallel, fixed-point hardware products would work well on these algorithms.


A Model of Auditory Streaming

Neural Information Processing Systems

The formation of associations between signals, which are considered to arise from the same external source, allows the organism to recognise significant patterns and relationships within the signals from each source without being confused by accidental coincidences between unrelated signals (Bregman, 1990). The intrinsically temporal nature of sound means that, in addition to being able to focus on the signal of interest, it is perhaps of equal significance to be able to predict how that signal is expected to progress; such expectations can then be used to facilitate further processing of the signal. It is important to remember that perception is a creative act (Luria, 1980). The organism creates its interpretation of the world in response to the current stimuli, within the context of its current state of alertness, attention, and previous experience. The creative aspects of perception are exemplified in the auditory system where peripheral processing decomposes acoustic stimuli.


Experiments with Neural Networks for Real Time Implementation of Control

Neural Information Processing Systems

This paper describes a neural network based controller for allocating capacity in a telecommunications network. This system was proposed in order to overcome a "real time" response constraint. Two basic architectures are evaluated: 1) a feedforward network-heuristic; and 2) a feedforward network-recurrent network. These architectures are compared against a linear programming (LP) optimiser as a benchmark. This LP optimiser was also used as a teacher to label the data samples for the feedforward neural network training algorithm. It is found that the systems are able to provide a traffic throughput of 99% and 95%, respectively, of the throughput obtained by the linear programming solution. Once trained, the neural network based solutions are found in a fraction of the time required by the LP optimiser.


Memory-based Stochastic Optimization

Neural Information Processing Systems

In this paper we introduce new algorithms for optimizing noisy plants in which each experiment is very expensive. The algorithms build a global nonlinear model of the expected output at the same time as using Bayesian linear regression analysis of locally weighted polynomial models. The local model answers queries about confidence, noise, gradient and Hessians, and uses them to make automated decisions similar to those made by a practitioner of Response Surface Methodology. The global and local models are combined naturally as a locally weighted regression. We examine the question of whether the global model can really help optimization, and we extend it to the case of time-varying functions. We compare the new algorithms with a highly tuned higher-order stochastic optimization algorithm on randomly-generated functions and a simulated manufacturing task. We note significant improvements in total regret, time to converge, and final solution quality. 1 INTRODUCTION In a stochastic optimization problem, noisy samples are taken from a plant. A sample consists of a chosen control u (a vector of real numbers) and a noisy observed response y.
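The kind of locally weighted polynomial model described above can be sketched as follows (a minimal locally weighted linear regression; the Gaussian kernel, the bandwidth, and the toy plant y = u² with small noise are our assumptions, not the paper's setup):

```python
import numpy as np

def lwr_predict(U, y, q, bandwidth=0.2):
    """Locally weighted linear regression over stored (u, y) experiments.
    A Gaussian kernel weights past experiments by distance to the query q;
    a weighted least-squares fit in the local basis [1, u - q] then gives
    both a prediction and a local gradient estimate at q."""
    w = np.exp(-0.5 * ((U - q) / bandwidth) ** 2)       # kernel weights
    X = np.column_stack([np.ones_like(U), U - q])       # local linear basis
    W = np.diag(w)
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)    # weighted least squares
    return beta[0], beta[1]                             # (prediction, gradient)

rng = np.random.default_rng(0)
U = rng.uniform(-2, 2, 200)                  # remembered experiments
y = U ** 2 + 0.01 * rng.normal(size=200)     # noisy plant y = u^2
pred, grad = lwr_predict(U, y, q=1.0)
print(pred, grad)  # near y(1) = 1 and y'(1) = 2
```

Because the fit is centered at the query, the intercept is the local prediction and the slope is the local gradient, which is how a single model can answer the gradient queries mentioned in the abstract.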


Temporal Difference Learning in Continuous Time and Space

Neural Information Processing Systems

Elucidation of the relationship between TD learning and dynamic programming (DP) has provided good theoretical insights (Barto et al., 1995). However, conventional TD algorithms were based on discrete-time, discrete-state formulations. In applying these algorithms to control problems, time, space and action had to be appropriately discretized using a priori knowledge or by trial and error. Furthermore, when a TD algorithm is used for neurobiological modeling, discrete-time operation is often very unnatural. There have been several attempts to extend TD-like algorithms to continuous cases. Bradtke et al. (1994) showed convergence results for DP-based algorithms for a discrete-time, continuous-state linear system with a quadratic cost. Bradtke and Duff (1995) derived TD-like algorithms for continuous-time, discrete-state systems (semi-Markov decision problems). Baird (1993) proposed the "advantage updating" algorithm by modifying Q-learning so that it works with arbitrarily small time steps.
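The continuous-time limit at issue here can be made concrete with a standard derivation (the notation, with τ as the discount time constant, is ours): write the per-step discount as γ = 1 − Δt/τ, divide the discrete TD error by Δt, and let Δt → 0.

```latex
% Discrete TD error with time step \Delta t and discount \gamma = 1 - \Delta t/\tau:
\delta_t = r(t)\,\Delta t + \Bigl(1 - \tfrac{\Delta t}{\tau}\Bigr) V(t+\Delta t) - V(t)
% Dividing by \Delta t and taking \Delta t \to 0 gives the continuous-time TD error:
\delta(t) = r(t) + \dot V(t) - \tfrac{1}{\tau}\, V(t)
```

The continuous-time error thus replaces the one-step backup V(t+Δt) − V(t) with the time derivative of the value function, with 1/τ playing the role of the discount rate.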


Dynamics of On-Line Gradient Descent Learning for Multilayer Neural Networks

Neural Information Processing Systems

We consider the problem of online gradient descent learning for general two-layer neural networks. An analytic solution is presented and used to investigate the role of the learning rate in controlling the evolution and convergence of the learning process. Two-layer networks with an arbitrary number of hidden units have been shown to be universal approximators [1] for such N-to-one dimensional maps. We investigate the emergence of generalization ability in an online learning scenario [2], in which the couplings are modified after the presentation of each example so as to minimize the corresponding error. The resulting changes in {J} are described as a dynamical evolution; the number of examples plays the role of time.
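The online scenario described here can be sketched in a teacher-student setup (a soft-committee-style two-layer network with fixed output weights; the sizes, learning rate, and random seed are our assumptions, not values from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 10, 3                       # input dimension, hidden units
teacher = rng.normal(size=(K, N)) / np.sqrt(N)   # target network
student = rng.normal(size=(K, N)) / np.sqrt(N)   # learner's couplings {J}
eta = 0.5                          # learning rate controls the dynamics

def net(W, x):
    """Two-layer N-to-one network with fixed unit output weights."""
    return np.sum(np.tanh(W @ x))

errors = []
for t in range(20000):             # example index t plays the role of time
    x = rng.normal(size=N)
    y = net(teacher, x)
    h = student @ x
    err = net(student, x) - y
    errors.append(0.5 * err ** 2)
    # Couplings are modified after each example to reduce its error.
    student -= eta / N * err * (1 - np.tanh(h) ** 2)[:, None] * x[None, :]

print(np.mean(errors[:1000]), np.mean(errors[-1000:]))  # error falls with "time"
```

Averaging the per-example error over windows of examples gives an empirical picture of the learning curve whose deterministic limit the analytic treatment describes.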