Coupled Dynamics of Fast Neurons and Slow Interactions

Neural Information Processing Systems

A.C.C. Coolen, R.W. Penney, D. Sherrington
Dept. of Physics - Theoretical Physics, University of Oxford, 1 Keble Road, Oxford OX1 3NP, U.K.

Abstract

A simple model of coupled dynamics of fast neurons and slow interactions, modelling self-organization in recurrent neural networks, leads naturally to an effective statistical mechanics characterized by a partition function which is an average over a replicated system. This is reminiscent of the replica trick used to study spin glasses, but with the difference that the number of replicas has a physical meaning as the ratio of two temperatures and can be varied throughout the whole range of real values. The model has interesting phase consequences as a function of this ratio and of external stimuli, and can be extended to a range of other models.

1 A SIMPLE MODEL WITH FAST DYNAMIC NEURONS AND SLOW DYNAMIC INTERACTIONS

As the basic archetypal model we consider a system of Ising spin neurons $\sigma_i \in \{-1,1\}$, $i \in \{1,\ldots,N\}$, interacting via continuous-valued symmetric interactions $J_{ij}$, which themselves evolve in response to the states of the neurons. The neurons are taken to have a stochastic field-alignment dynamics which is fast compared with the evolution rate of the interactions $J_{ij}$, such that on the timescale of the $J_{ij}$-dynamics the neurons are effectively in equilibrium according to a Boltzmann distribution

$$p_{\{J_{ij}\}}(\{\sigma_i\}) = Z_\beta^{-1}(\{J_{ij}\})\, e^{-\beta H_{\{J_{ij}\}}(\{\sigma_i\})} \qquad (1)$$

where

$$H_{\{J_{ij}\}}(\{\sigma_i\}) = -\sum_{i<j} J_{ij}\,\sigma_i\sigma_j \qquad (2)$$

and the subscript $\{J_{ij}\}$ indicates that the $\{J_{ij}\}$ are to be considered as quenched variables. In practice, several specific types of dynamics which obey detailed balance lead to the equilibrium distribution (1), such as a Markov process with single-spin-flip Glauber dynamics [1]. The interactions, in turn, are taken to evolve according to a slow stochastic Hebbian dynamics with decay,

$$\frac{d}{dt}J_{ij} = \frac{1}{N}\langle\sigma_i\sigma_j\rangle - \mu J_{ij} + \eta_{ij}(t) \qquad (3)$$

where $\langle\sigma_i\sigma_j\rangle$ denotes the average over the equilibrium distribution (1) and $\eta_{ij}(t)$ is Gaussian white noise with $\langle\eta_{ij}(t)\eta_{kl}(t')\rangle = \frac{2}{\tilde\beta N}\,\delta_{(ij),(kl)}\,\delta(t-t')$. The second term acts to limit the magnitude of $J_{ij}$; $\tilde\beta$ is the characteristic inverse temperature of the interaction system. We may rewrite (3) as

$$\frac{d}{dt}J_{ij} = -\frac{1}{N}\frac{\partial}{\partial J_{ij}}\mathcal{H}(\{J_{ij}\}) + \eta_{ij}(t) \qquad (4)$$

where the effective Hamiltonian $\mathcal{H}(\{J_{ij}\})$ is given by

$$\mathcal{H}(\{J_{ij}\}) = -\frac{1}{\beta}\ln Z_\beta(\{J_{ij}\}) + \frac{\mu N}{2}\sum_{i<j}J_{ij}^2 \qquad (5)$$

We now recognise (4) as having the form of a Langevin equation, so that the equilibrium distribution of the interaction system is given by a Boltzmann form with partition function

$$Z_{\tilde\beta} = \int\!\prod_{i<j}dJ_{ij}\; e^{-\tilde\beta\,\mathcal{H}(\{J_{ij}\})} = \int\!\prod_{i<j}dJ_{ij}\;\big[Z_\beta(\{J_{ij}\})\big]^{\,n}\, e^{-\frac{1}{2}\tilde\beta\mu N\sum_{i<j}J_{ij}^2} \qquad (6)$$

where $n = \tilde\beta/\beta$. We may use $Z_{\tilde\beta}$ as a generating functional to produce thermodynamic averages of state variables $f(\{\sigma_i\};\{J_{ij}\})$ in the combined system by adding suitable infinitesimal source terms to the neuron Hamiltonian (2). In fact, any real $n$ is possible by tuning the ratio between the two $\beta$'s. In the formulation presented in this paper $n$ is always nonnegative, but negative values are possible if the Hebbian rule of (3) is replaced by an anti-Hebbian form with $\langle\sigma_i\sigma_j\rangle$ replaced by $-\langle\sigma_i\sigma_j\rangle$ (the case of negative $n$ is being studied by Mezard and coworkers [7]).
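A minimal simulation sketch of this two-timescale model, alternating Glauber sweeps of the fast neurons with Langevin updates of the slow couplings. All parameter values, the number of equilibration sweeps, and the exact noise normalization below are illustrative choices, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

N = 20            # number of Ising neurons
beta = 2.0        # inverse temperature of the fast neuron dynamics
beta_tilde = 1.0  # inverse temperature of the slow interaction dynamics
mu = 1.0          # decay constant limiting |J_ij|
dt = 0.01         # time step of the slow (Langevin) dynamics
n_sweeps = 100    # Glauber sweeps used to estimate <sigma_i sigma_j>

sigma = rng.choice([-1, 1], size=N)
J = np.zeros((N, N))

def glauber_sweep(sigma, J, beta):
    """One sweep of single-spin-flip Glauber dynamics at inverse temperature beta."""
    for i in rng.permutation(len(sigma)):
        h = J[i] @ sigma                      # local field (J has zero diagonal)
        p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * h))
        sigma[i] = 1 if rng.random() < p_up else -1
    return sigma

for step in range(20):
    # fast dynamics: let the neurons equilibrate, accumulate <sigma_i sigma_j>
    corr = np.zeros((N, N))
    for _ in range(n_sweeps):
        sigma = glauber_sweep(sigma, J, beta)
        corr += np.outer(sigma, sigma)
    corr /= n_sweeps
    # slow dynamics: Hebbian drive, decay, and symmetric thermal noise
    noise = rng.normal(size=(N, N))
    noise = (noise + noise.T) / np.sqrt(2.0)  # keep J symmetric
    J += dt * (corr / N - mu * J) + np.sqrt(2.0 * dt / (beta_tilde * N)) * noise
    np.fill_diagonal(J, 0.0)
```

The separation of timescales is enforced by letting the neurons take many sweeps between each single small update of the couplings.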


Unsupervised Parallel Feature Extraction from First Principles

Neural Information Processing Systems

EE., Linkoping University, S-58183 Linkoping, Sweden

Abstract

We describe a number of learning rules that can be used to train unsupervised parallel feature extraction systems. The learning rules are derived using gradient ascent of a quality function. We consider a number of quality functions that are rational functions of higher-order moments of the extracted feature values. We show that one system learns the principal components of the correlation matrix. Principal component analysis systems are usually not optimal feature extractors for classification. Therefore we design quality functions which produce feature vectors that support unsupervised classification. The properties of the different systems are compared with the help of different artificially designed datasets and a database consisting of all Munsell color spectra.

1 Introduction

There are a number of unsupervised Hebbian learning algorithms (see Oja, 1992 and references therein) that perform some version of the Karhunen-Loeve expansion.
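A classic member of this family is Oja's single-unit rule, which performs Hebbian learning with an implicit norm constraint and converges to the leading principal component of the input correlation matrix. A minimal sketch (the data distribution and learning rate are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# synthetic 2-D data with one dominant direction of variance
C = np.array([[3.0, 1.0],
              [1.0, 1.0]])                    # illustrative correlation matrix
X = rng.multivariate_normal([0.0, 0.0], C, size=5000)

w = rng.normal(size=2)                        # feature-extraction weight vector
eta = 0.01                                    # learning rate
for x in X:
    y = w @ x                                 # extracted feature value
    w += eta * y * (x - y * w)                # Oja's rule: Hebb term + normalization

# the learned weights align with the leading eigenvector of C, with |w| -> 1
eigvecs = np.linalg.eigh(C)[1]
cosine = abs(w @ eigvecs[:, -1]) / np.linalg.norm(w)
```

The `- y * w` term is what keeps the weight norm bounded; without it, plain Hebbian learning diverges.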


Resolving motion ambiguities

Neural Information Processing Systems

We address the problem of optical flow reconstruction, and in particular the problem of resolving ambiguities near edges. These occur due to (i) the aperture problem and (ii) the occlusion problem, where pixels on both sides of an intensity edge are assigned the same velocity estimates (and confidence). However, these measurements are correct for only one side of the edge (the non-occluded one). Our approach is to introduce an uncertainty field with respect to the estimates and confidence measures. We note that the confidence measures are large at intensity edges and larger at the convex sides of the edges, i.e. inside corners, than at the concave sides. We resolve the ambiguities through local interactions via coupled Markov random fields (MRFs). The result is the detection of motion for regions of images with large global convexity.
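The flavor of such local MRF interactions can be shown on a one-dimensional toy strip of pixels: noisy, ambiguous per-pixel motion evidence is cleaned up by coupling neighboring labels, while a few high-confidence measurements (standing in for convex corners) anchor each side of the edge. This is a simplified stand-in, iterated conditional modes on an Ising-type field with invented numbers, not the paper's actual formulation:

```python
import numpy as np

rng = np.random.default_rng(6)

# 1-D strip of pixels straddling an intensity edge: label s_i = -1 means
# "moves with the left surface", s_i = +1 "moves with the right surface"
n = 40
true_side = np.array([-1] * 20 + [+1] * 20)
conf = true_side + rng.normal(scale=1.5, size=n)   # weak, noisy local evidence
conf[5], conf[34] = -6.0, +6.0                     # two high-confidence anchors

s = np.where(conf > 0, 1, -1)                      # initial (ambiguous) labeling
init_changes = int(np.sum(s[1:] != s[:-1]))

# coupled-MRF relaxation by iterated conditional modes: each pixel repeatedly
# minimizes the local energy  -conf[i]*s_i - lam * sum_neighbors s_i*s_j
lam = 1.0
for sweep in range(20):
    for i in range(n):
        field = conf[i] + lam * sum(s[j] for j in (i - 1, i + 1) if 0 <= j < n)
        s[i] = 1 if field > 0 else -1

final_changes = int(np.sum(s[1:] != s[:-1]))       # smoother: fewer label flips
```

The relaxed labeling keeps the high-confidence anchors and removes most of the spurious label changes produced by the noisy evidence.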


Supervised Learning with Growing Cell Structures

Neural Information Processing Systems

Center positions are continuously updated through soft competitive learning. The width of the radial basis functions is derived from the distance to topological neighbors. During training, the observed error is accumulated locally and used to determine where to insert the next unit. This leads (in the case of classification problems) to the placement of units near class borders rather than near frequency peaks, as is done by most existing methods. The resulting networks need few training epochs and seem to generalize very well. This is demonstrated by examples.
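A stripped-down sketch of the error-driven growth idea: prototype units are moved by soft competitive learning, classification error is accumulated at each winning unit, and a new unit is inserted next to the unit with the largest accumulated error. This omits the topological neighborhood and RBF widths of the actual method; data, rates, and the majority-vote labeling are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

# toy 2-class problem: two overlapping Gaussian blobs
X = np.vstack([rng.normal([-1.5, 0.0], 1.0, size=(200, 2)),
               rng.normal([+1.5, 0.0], 1.0, size=(200, 2))])
y = np.array([0] * 200 + [1] * 200)

centers = rng.normal(scale=0.5, size=(2, 2))    # start with two prototype units
eps = 0.05                                      # soft competitive learning rate

for epoch in range(6):
    # soft competitive learning: move the winning unit toward each input
    for x in X:
        k = int(np.argmin(np.linalg.norm(centers - x, axis=1)))
        centers[k] += eps * (x - centers[k])
    # label each unit by majority vote over the inputs it wins
    winners = np.argmin(np.linalg.norm(X[:, None, :] - centers[None, :, :],
                                       axis=2), axis=1)
    labels = np.array([np.bincount(y[winners == k], minlength=2).argmax()
                       for k in range(len(centers))])
    # accumulate the classification error locally at each winning unit
    err = np.bincount(winners, weights=(labels[winners] != y).astype(float),
                      minlength=len(centers))
    # growth step: insert a new unit next to the unit with the largest
    # accumulated error, which tends to lie near a class border
    q = int(np.argmax(err))
    centers = np.vstack([centers, centers[q] + rng.normal(scale=0.3, size=2)])
```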


A Computational Model for Cursive Handwriting Based on the Minimization Principle

Neural Information Processing Systems

We propose a trajectory planning and control theory for continuous movements such as connected cursive handwriting and continuous natural speech. Its hardware is based on our previously proposed forward-inverse-relaxation neural network (Wada & Kawato, 1993). Computationally, its optimization principle is the minimum torque-change criterion. Regarding the representation level, hard constraints satisfied by a trajectory are represented as a set of via-points extracted from a handwritten character. Accordingly, we propose a via-point estimation algorithm that estimates via-points by repeating the trajectory formation of a character and the via-point extraction from the character. In experiments, good quantitative agreement is found between human handwriting data and the trajectories generated by the theory. Finally, we propose a recognition schema based on the movement generation. We show a result in which the recognition schema is applied to handwritten character recognition and can be extended to the phoneme timing estimation of natural speech.

1 INTRODUCTION

In reaching movements, trajectory formation is an ill-posed problem because the hand can move along an infinite number of possible trajectories from the starting to the target point.
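The paper's criterion is minimum torque-change, which requires an arm-dynamics model; as a kindred smoothness criterion that is easy to show in a few lines, the sketch below computes a minimum-jerk-style trajectory passing exactly through hard via-point constraints, by minimizing the summed squared third differences of a discretized path. The via-point values are invented, and only positions (not boundary velocities or accelerations) are constrained:

```python
import numpy as np

T = 101
# third-difference operator; ||D3 x||^2 approximates the integrated squared jerk
D3 = np.zeros((T - 3, T))
for i in range(T - 3):
    D3[i, i:i + 4] = [-1.0, 3.0, -3.0, 1.0]

# hard constraints: two endpoints plus two via-points (values are invented)
constraints = [(0, 0.0), (T // 3, 1.0), (2 * T // 3, -0.5), (T - 1, 0.0)]
A = np.zeros((len(constraints), T))
c = np.array([v for _, v in constraints])
for row, (idx, _) in enumerate(constraints):
    A[row, idx] = 1.0

# KKT system for:  minimize ||D3 x||^2   subject to   A x = c
K = np.block([[2.0 * D3.T @ D3, A.T],
              [A, np.zeros((len(constraints), len(constraints)))]])
x = np.linalg.solve(K, np.concatenate([np.zeros(T), c]))[:T]
```

The solution `x` is the smoothest discrete trajectory, in the squared-jerk sense, that threads the via-points exactly; the via-points act as the hard constraints in the representation described above.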


A Comparative Study of a Modified Bumptree Neural Network with Radial Basis Function Networks and the Standard Multi Layer Perceptron

Neural Information Processing Systems

Bumptrees are geometric data structures introduced by Omohundro (1991) to provide efficient access to a collection of functions on a Euclidean space of interest. We describe a modified bumptree structure that has been employed as a neural network classifier, and compare its performance on several classification tasks against that of radial basis function networks and the standard multi-layer perceptron.

1 INTRODUCTION

A number of neural network studies have demonstrated the utility of the multi-layer perceptron (MLP) and shown it to be a highly effective paradigm. Studies have also shown, however, that the MLP is not without its problems: in particular, it requires an extensive training time, is susceptible to local minima, and its performance is dependent upon its internal network architecture. In an attempt to improve upon the generalisation performance and computational efficiency, a number of studies have been undertaken, principally concerned with investigating the parametrisation of the MLP. It is well known, for example, that the generalisation performance of the MLP is affected by the number of hidden units in the network, which have to be determined empirically since theory provides no guidance.


Autoencoders, Minimum Description Length and Helmholtz Free Energy

Neural Information Processing Systems

An autoencoder network uses a set of recognition weights to convert an input vector into a code vector. It then uses a set of generative weights to convert the code vector into an approximate reconstruction of the input vector. We derive an objective function for training autoencoders based on the Minimum Description Length (MDL) principle. The aim is to minimize the information required to describe both the code vector and the reconstruction error. We show that this information is minimized by choosing code vectors stochastically according to a Boltzmann distribution, where the generative weights define the energy of each possible code vector given the input vector. Unfortunately, if the code vectors use distributed representations, it is exponentially expensive to compute this Boltzmann distribution because it involves all possible code vectors. We show that the recognition weights of an autoencoder can be used to compute an approximation to the Boltzmann distribution and that this approximation gives an upper bound on the description length. Even when this bound is poor, it can be used as a Lyapunov function for learning both the generative and the recognition weights. We demonstrate that this approach can be used to learn factorial codes.
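The bound can be checked numerically on a code space small enough to enumerate: for any factorized (factorial) distribution q over code vectors, the free energy (expected energy minus entropy) exceeds the optimal description length -log Z by exactly KL(q || p), which is nonnegative. The energy function below is an arbitrary stand-in for the code-plus-reconstruction cost, not a trained generative model:

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(3)

n_bits = 6
codes = np.array(list(product([0, 1], repeat=n_bits)), dtype=float)

# an arbitrary energy for each code vector (stands in for the cost, in nats,
# of describing the code plus the reconstruction error it produces)
W = rng.normal(size=(n_bits, n_bits))
b = rng.normal(size=n_bits)
E = -np.einsum('ci,ij,cj->c', codes, W, codes) - codes @ b

# exact Boltzmann distribution over all 2^n codes; -log Z is the optimal
# description length, achievable only by this exponentially expensive sum
p = np.exp(-E) / np.exp(-E).sum()
optimal_dl = -np.log(np.exp(-E).sum())

# a fully factorized approximation q, as a recognition network would produce
probs = rng.uniform(0.2, 0.8, size=n_bits)
q = np.prod(np.where(codes == 1, probs, 1 - probs), axis=1)

# free energy = expected energy - entropy; always an upper bound on -log Z
free_energy = float((q * E).sum() + (q * np.log(q)).sum())
gap = free_energy - optimal_dl        # equals KL(q || p) >= 0
```

Minimizing the free energy over both the energies (generative weights) and q (recognition weights) is what makes the bound a usable Lyapunov function even when it is loose.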


Dual Mechanisms for Neural Binding and Segmentation

Neural Information Processing Systems

We propose that the binding and segmentation of visual features is mediated by two complementary mechanisms: a low-resolution, spatial-based, resource-free process and a high-resolution, temporal-based, resource-limited process. In the visual cortex, the former depends upon the orderly topographic organization in striate and extrastriate areas, while the latter may be related to observed temporal relationships between neuronal activities. Computer simulations illustrate the role the two mechanisms play in figure/ground discrimination, depth-from-occlusion, and the vividness of perceptual completion.

1 COMPLEMENTARY BINDING MECHANISMS

The "binding problem" is a classic problem in computational neuroscience which considers how neuronal activities are grouped to create mental representations. For the case of visual processing, the binding of neuronal activities requires a mechanism for selectively grouping fragmented visual features in order to construct the coherent representations (i.e.


Convergence of Indirect Adaptive Asynchronous Value Iteration Algorithms

Neural Information Processing Systems

Reinforcement Learning methods based on approximating dynamic programming (DP) are receiving increased attention due to their utility in forming reactive control policies for systems embedded in dynamic environments. Environments are usually modeled as controlled Markov processes, but when the environment model is not known a priori, adaptive methods are necessary. Adaptive control methods are often classified as being direct or indirect. Direct methods directly adapt the control policy from experience, whereas indirect methods adapt a model of the controlled process and compute control policies based on the latest model. Our focus in this paper is on indirect adaptive DP-based methods. We present a convergence result for indirect adaptive asynchronous value iteration algorithms for the case in which a lookup table is used to store the value function. Our result implies convergence of several existing reinforcement learning algorithms such as adaptive real-time dynamic programming (ARTDP) (Barto, Bradtke, & Singh, 1993) and prioritized sweeping (Moore & Atkeson, 1993). Although the emphasis of researchers studying DP-based reinforcement learning has been on direct adaptive methods such as Q-Learning (Watkins, 1989) and methods using TD algorithms (Sutton, 1988), it is not clear that these direct methods are preferable in practice to indirect methods such as those analyzed in this paper.
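A minimal sketch of such an indirect method on an invented five-state chain: the agent estimates transition probabilities and expected rewards from counts and, after each transition, performs one asynchronous value-iteration backup at the visited state using the latest model and a lookup-table value function. The environment, exploration policy, and step counts are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)

n_states, n_actions, gamma = 5, 2, 0.9

def true_step(s, a):
    """Unknown environment: a 5-state chain; action 0 moves left, 1 moves right.
    Reward 1 is received whenever the right end of the chain is entered or kept."""
    s2 = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
    return s2, (1.0 if s2 == n_states - 1 else 0.0)

counts = np.zeros((n_states, n_actions, n_states))   # transition counts
rew_sum = np.zeros((n_states, n_actions))            # summed observed rewards
V = np.zeros(n_states)                               # lookup-table value function

s = 0
for t in range(5000):
    a = int(rng.integers(n_actions))                 # exploratory (random) policy
    s2, r = true_step(s, a)
    counts[s, a, s2] += 1
    rew_sum[s, a] += r
    # asynchronous VI backup at the visited state, using the latest model
    n_sa = counts[s].sum(axis=1)
    if np.all(n_sa > 0):
        P_hat = counts[s] / n_sa[:, None]            # estimated transition probs
        R_hat = rew_sum[s] / n_sa                    # estimated expected rewards
        V[s] = np.max(R_hat + gamma * P_hat @ V)
    s = s2
```

Because backups are applied only at visited states, the algorithm is asynchronous in exactly the sense analyzed in the convergence result: it converges as long as the model estimates improve and every state continues to be backed up.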


Supervised learning from incomplete data via an EM approach

Neural Information Processing Systems

Real-world learning tasks may involve high-dimensional data sets with arbitrary patterns of missing data. In this paper we present a framework based on maximum likelihood density estimation for learning from such data sets. We use mixture models for the density estimates and make two distinct appeals to the Expectation-Maximization (EM) principle (Dempster et al., 1977) in deriving a learning algorithm: EM is used both for the estimation of mixture components and for coping with missing data. The resulting algorithm is applicable to a wide range of supervised as well as unsupervised learning problems.
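A sketch of the two appeals to EM for a diagonal-covariance Gaussian mixture on synthetic data: with diagonal components, the responsibilities need only the observed dimensions of each point, and a missing entry's conditional expectation under component k is simply that component's mean (plus a variance correction in the M-step). Data, initialization, and iteration count are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)

# two diagonal Gaussian clusters in 2-D, with ~20% of entries missing at random
X = np.vstack([rng.normal([-3.0, -3.0], 1.0, size=(150, 2)),
               rng.normal([+3.0, +3.0], 1.0, size=(150, 2))])
mask = rng.random(X.shape) < 0.8            # True where the entry is observed
X = np.where(mask, X, np.nan)

K, D = 2, 2
pi = np.full(K, 0.5)
mu = np.array([[-1.0, 0.0], [1.0, 0.0]])    # symmetry-breaking initial means
var = np.ones((K, D))

for it in range(50):
    # E-step: responsibilities use only the observed dimensions of each point
    log_r = np.tile(np.log(pi), (len(X), 1))
    for k in range(K):
        ll = -0.5 * (np.log(2 * np.pi * var[k]) + (X - mu[k]) ** 2 / var[k])
        log_r[:, k] += np.where(mask, ll, 0.0).sum(axis=1)
    r = np.exp(log_r - log_r.max(axis=1, keepdims=True))
    r /= r.sum(axis=1, keepdims=True)
    # M-step: fill each missing entry with its conditional expectation mu[k]
    # (diagonal covariance), and add var[k] for its conditional variance
    for k in range(K):
        Xk = np.where(mask, X, mu[k])
        w = r[:, k]
        mu[k] = (w[:, None] * Xk).sum(axis=0) / w.sum()
        extra = np.where(mask, 0.0, var[k])
        var[k] = (w[:, None] * ((Xk - mu[k]) ** 2 + extra)).sum(axis=0) / w.sum()
    pi = r.mean(axis=0)
```

The same E-step machinery handles both jobs at once: soft assignment of points to components and soft completion of the missing coordinates.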