Europe
Onset-based Sound Segmentation
A technique for segmenting sounds using processing based on mammalian early auditory processing is presented. The technique is based on features in sound which neuron spike recording suggests are detected in the cochlear nucleus. The sound signal is bandpassed and each signal processed to enhance onsets and offsets. The onset and offset signals are compressed, then clustered both in time and across frequency channels using a network of integrate-and-fire neurons. Onsets and offsets are signalled by spikes, and the timing of these spikes used to segment the sound. 1 Background Traditional speech interpretation techniques based on Fourier transforms, spectrum recoding, and a hidden Markov model or neural network interpretation stage have limitations both in continuous speech and in interpreting speech in the presence of noise, and this has led to interest in front ends modelling biological auditory systems for speech interpretation systems (Ainsworth and Meyer 92; Cosi 93; Cole et al. 95). Auditory modelling systems use similar early auditory processing to that used in biological systems.
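The onset-to-spike idea in the abstract above can be illustrated with a minimal single-channel sketch. It is an assumption-laden toy, not the authors' system: the names `onset_signal` and `lif_spike_times`, the leak factor and the threshold are all hypothetical, and the compression and cross-channel clustering stages are left out.

```python
def onset_signal(envelope):
    """Half-wave rectified first difference: nonzero only where the envelope rises."""
    return [max(0.0, b - a) for a, b in zip(envelope, envelope[1:])]


def lif_spike_times(drive, leak=0.9, threshold=1.0):
    """Leaky integrate-and-fire unit; returns the sample indices at which it fires."""
    v = 0.0
    spikes = []
    for t, x in enumerate(drive):
        v = leak * v + x        # leaky integration of the onset drive
        if v >= threshold:      # fire, then reset the membrane state
            spikes.append(t)
            v = 0.0
    return spikes


# Toy envelope for one bandpass channel: silence, abrupt rise, sustain, silence.
envelope = [0.0] * 5 + [2.0] * 5 + [0.0] * 5
spikes = lif_spike_times(onset_signal(envelope))
# The single spike marks the step between samples 4 and 5, i.e. the onset.
```

An offset detector could be sketched the same way by rectifying the negative differences instead; segment boundaries would then be read off the spike times.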
Investment Learning with Hierarchical PSOMs
Walter, Jörg A., Ritter, Helge
We propose a hierarchical scheme for rapid learning of context dependent "skills" that is based on the recently introduced "Parameterized Self Organizing Map" ("PSOM"). The underlying idea is to first invest some learning effort to specialize the system into a rapid learner for a more restricted range of contexts. The specialization is carried out by a prior "investment learning stage", during which the system acquires a set of basis mappings or "skills" for a set of prototypical contexts. Adaptation of a "skill" to a new context can then be achieved by interpolating in the space of the basis mappings and thus can be extremely rapid. We demonstrate the potential of this approach for the task of a 3D visuomotor map for a Puma robot and two cameras. This includes the forward and backward robot kinematics in 3D end effector coordinates, the 2D+2D retina coordinates and also the 6D joint angles. After the investment phase the transformation can be learned for a new camera setup with a single observation.
Improved Gaussian Mixture Density Estimates Using Bayesian Penalty Terms and Network Averaging
We compare two regularization methods which can be used to improve the generalization capabilities of Gaussian mixture density estimates. The first method uses a Bayesian prior on the parameter space. We derive EM (Expectation Maximization) update rules which maximize the a posteriori parameter probability. In the second approach we apply ensemble averaging to density estimation. This includes Breiman's "bagging", which recently has been found to produce impressive results for classification networks.
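Ensemble averaging of density estimates, as mentioned in the abstract above, can be sketched roughly as follows. This is a simplified illustration under stated assumptions: each ensemble member is a single Gaussian rather than a Gaussian mixture, and the function names, the variance floor and the number of models are hypothetical choices, not taken from the paper.

```python
import math
import random


def fit_gaussian(sample, var_floor=1e-2):
    """Maximum-likelihood mean and variance, with a floor against degenerate resamples."""
    mean = sum(sample) / len(sample)
    var = sum((x - mean) ** 2 for x in sample) / len(sample)
    return mean, max(var, var_floor)


def gaussian_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def bagged_density(data, n_models=25, seed=0):
    """Fit one Gaussian per bootstrap resample and return the averaged density."""
    rng = random.Random(seed)
    models = [fit_gaussian([rng.choice(data) for _ in data]) for _ in range(n_models)]
    return lambda x: sum(gaussian_pdf(x, m, v) for m, v in models) / n_models


data = [0.1, 0.4, 0.5, 0.9, 1.3, 1.7]
density = bagged_density(data)
```

Because every member is a normalized density, their average is again a normalized density; the same averaging step would apply unchanged if each member were a mixture fitted by EM.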
Discovering Structure in Continuous Variables Using Bayesian Networks
Hofmann, Reimar, Tresp, Volker
We study Bayesian networks for continuous variables using nonlinear conditional density estimators. We demonstrate that useful structures can be extracted from a data set in a self-organized way and we present sampling techniques for belief update based on Markov blanket conditional density models. 1 Introduction One of the strongest types of information that can be learned about an unknown process is the discovery of dependencies and, even more important, of independencies. A superior example is medical epidemiology where the goal is to find the causes of a disease and exclude factors which are irrelevant.
Some results on convergent unlearning algorithm
Semenov, Serguei A., Shuvalova, Irina B.
In recent years unsupervised learning schemes have aroused strong interest among researchers, but so far little is known about the underlying learning mechanisms, and still fewer rigorous results, such as convergence theorems, have been obtained in this field. One promising concept along this line is so-called "unlearning" for Hopfield-type neural networks (Hopfield et al., 1983; van Hemmen & Klemmer, 1992; Wimbauer et al., 1994). Elaborating on these elegant ideas, the convergent unlearning algorithm has recently been proposed (Plakhov & Semenov, 1994), which executes without presentation of patterns. It aims to correct the initial Hebbian connectivity in order to provide extensive storage of arbitrarily correlated data. This algorithm is stated as follows. Pick up at iteration step m, m = 0, 1, 2, ... a random network state s(m)
Bayesian Methods for Mixtures of Experts
Waterhouse, Steve R., MacKay, David, Robinson, Anthony J.
Tel: [+44] 1223 332815 ajr@eng.cam.ac.uk ABSTRACT We present a Bayesian framework for inferring the parameters of a mixture of experts model based on ensemble learning by variational free energy minimisation. The Bayesian approach avoids the over-fitting and noise level underestimation problems of traditional maximum likelihood inference. We demonstrate these methods on artificial problems and sunspot time series prediction. INTRODUCTION The task of estimating the parameters of adaptive models such as artificial neural networks using Maximum Likelihood (ML) is well documented, e.g. Geman, Bienenstock & Doursat (1992). ML estimates typically lead to models with high variance, a process known as "over-fitting".
Dynamics of On-Line Gradient Descent Learning for Multilayer Neural Networks
Solla, CONNECT, The Niels Bohr Institute, Blegdamsvej 17, Copenhagen 2100, Denmark. Abstract We consider the problem of online gradient descent learning for general two-layer neural networks. An analytic solution is presented and used to investigate the role of the learning rate in controlling the evolution and convergence of the learning process. Two-layer networks with an arbitrary number of hidden units have been shown to be universal approximators [1] for such N-to-one dimensional maps. We investigate the emergence of generalization ability in an online learning scenario [2], in which the couplings are modified after the presentation of each example so as to minimize the corresponding error. The resulting changes in {J} are described as a dynamical evolution; the number of examples plays the role of time.
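The online scenario described above, in which the couplings {J} are updated after each example, can be sketched as a small student-teacher simulation. Everything specific here is an assumption made for illustration: the erf-shaped hidden activation, unit hidden-to-output weights, the 1/N scaling of the step, the network sizes and the learning rate are not given in the abstract.

```python
import math
import random


def g(x):
    """Sigmoidal hidden-unit activation (assumed here to be erf-shaped)."""
    return math.erf(x / math.sqrt(2.0))


def g_prime(x):
    return math.sqrt(2.0 / math.pi) * math.exp(-0.5 * x * x)


def output(weights, xi):
    """Two-layer map: sum of hidden activations, with unit output weights."""
    return sum(g(sum(w * x for w, x in zip(row, xi))) for row in weights)


def online_step(J, xi, target, eta):
    """One gradient step on 0.5 * (output - target)**2 for a single example."""
    n = len(xi)
    fields = [sum(w * x for w, x in zip(row, xi)) for row in J]
    err = target - sum(g(h) for h in fields)
    return [[w + (eta / n) * g_prime(h) * err * x for w, x in zip(row, xi)]
            for row, h in zip(J, fields)]


rng = random.Random(0)
N, K = 20, 2                                   # input dimension, hidden units
teacher = [[rng.gauss(0, 1) / math.sqrt(N) for _ in range(N)] for _ in range(K)]
student = [[rng.gauss(0, 0.1) / math.sqrt(N) for _ in range(N)] for _ in range(K)]

for _ in range(500):                           # each new example is one time step
    xi = [rng.gauss(0, 1) for _ in range(N)]
    student = online_step(student, xi, output(teacher, xi), eta=0.1)
```

Varying `eta` in such a simulation is one way to observe empirically how the learning rate affects the evolution of the couplings; the paper itself treats this analytically.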
Modern Analytic Techniques to Solve the Dynamics of Recurrent Neural Networks
Coolen, A.C.C., Laughton, S. N., Sherrington, D.
We describe the use of modern analytical techniques in solving the dynamics of symmetric and nonsymmetric recurrent neural networks near saturation. These explicitly take into account the correlations between the post-synaptic potentials, and thereby allow for a reliable prediction of transients. 1 INTRODUCTION Recurrent neural networks have been rather popular in the physics community, because they lend themselves so naturally to analysis with tools from equilibrium statistical mechanics. This was the main theme of physicists between, say, 1985 and 1990. Less familiar to the neural network community is a subsequent wave of theoretical physical studies, dealing with the dynamics of symmetric and nonsymmetric recurrent networks. The strategy here is to try to describe the processes at a reduced level of an appropriate small set of dynamic macroscopic observables.
Stable Dynamic Parameter Adaption
A stability criterion for dynamic parameter adaptation is given. In the case of the learning rate of backpropagation, a class of stable algorithms is presented and studied, including a convergence proof. 1 INTRODUCTION All but a few learning algorithms employ one or more parameters that control the quality of learning. Backpropagation has its learning rate and momentum parameter; Boltzmann learning uses a simulated annealing schedule; Kohonen learning a learning rate and a decay parameter; genetic algorithms probabilities, etc. The investigator always has to set the parameters to specific values when trying to solve a certain problem. Traditionally, the metaproblem of adjusting the parameters is solved by relying on a set of well-tested values from other problems or an intensive search for good parameter regions by restarting the experiment with different values. In this situation, a great deal of expertise and/or time for experiment design is required (as well as a huge amount of computing time).
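As a rough illustration of adapting a learning rate during training instead of fixing it by hand, the sketch below tries a larger and a smaller rate at every step and keeps whichever gives the lower error. It is a generic scheme on a toy quadratic, not necessarily a member of the class of algorithms the paper studies; the loss, the factor 1.8 and all names are assumptions.

```python
def loss(w):
    """Toy ill-conditioned quadratic error surface."""
    return 0.5 * (w[0] ** 2 + 10.0 * w[1] ** 2)


def grad(w):
    return [w[0], 10.0 * w[1]]


def adaptive_descent(w, eta=0.01, factor=1.8, steps=50):
    """Gradient descent that rescales its own learning rate at every step."""
    for _ in range(steps):
        gradient = grad(w)
        candidates = []
        for rate in (eta * factor, eta / factor):
            trial = [wi - rate * gi for wi, gi in zip(w, gradient)]
            candidates.append((loss(trial), rate, trial))
        best_loss, best_rate, best_w = min(candidates, key=lambda c: c[0])
        if best_loss < loss(w):
            w, eta = best_w, best_rate   # accept the better of the two trial steps
        else:
            eta /= factor                # both trials increased the error: shrink
    return w, eta


w_final, eta_final = adaptive_descent([1.0, 1.0])
```

Since a step is only accepted when it lowers the error, the error never increases along the run, which is the kind of stability property one would want from any such adaptation rule.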
On the Computational Power of Noisy Spiking Neurons
It has remained unknown whether one can in principle carry out reliable digital computations with networks of biologically realistic models for neurons. This article presents rigorous constructions for simulating in real-time arbitrary given boolean circuits and finite automata with arbitrarily high reliability by networks of noisy spiking neurons. In addition we show that with the help of "shunting inhibition" even networks of very unreliable spiking neurons can simulate in real-time any McCulloch-Pitts neuron (or "threshold gate"), and therefore any multilayer perceptron (or "threshold circuit") in a reliable manner. These constructions provide a possible explanation for the fact that biological neural systems can carry out quite complex computations within 100 msec. It turns out that the assumptions that these constructions require about the shape of the EPSPs and the behaviour of the noise are surprisingly weak.