Weight Space Probability Densities in Stochastic Learning: I. Dynamics and Equilibria

Neural Information Processing Systems

The ensemble dynamics of stochastic learning algorithms can be studied using theoretical techniques from statistical physics. We develop the equations of motion for the weight space probability densities for stochastic learning algorithms. We discuss equilibria in the diffusion approximation and provide expressions for special cases of the LMS algorithm. The equilibrium densities are not in general thermal (Gibbs) distributions in the objective function being minimized, but rather depend upon an effective potential that includes diffusion effects. Finally, we present an exact analytical expression for the time evolution of the density for a learning algorithm with weight updates proportional to the sign of the gradient.
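The idea of a weight space density can be made concrete with a small numerical experiment. The following is a minimal sketch, not the paper's analysis: it assumes a one-dimensional LMS learner with Gaussian inputs and Gaussian target noise, and the function names, the target weight `w_true`, and all parameter values are hypothetical choices made for illustration. It runs many independent learners and summarizes the resulting ensemble of final weights.

```python
import random


def simulate_lms_ensemble(mu, n_members=300, n_steps=1000,
                          w_true=2.0, noise_std=0.5, seed=0):
    """Run independent 1-D LMS learners and return their final weights.

    Assumed toy setup: input x ~ N(0, 1), target d = w_true * x + noise.
    """
    rng = random.Random(seed)
    weights = []
    for _ in range(n_members):
        w = 0.0
        for _ in range(n_steps):
            x = rng.gauss(0.0, 1.0)
            d = w_true * x + rng.gauss(0.0, noise_std)
            # LMS update: step along the instantaneous error gradient.
            w += mu * x * (d - w * x)
        weights.append(w)
    return weights


def mean_and_variance(values):
    """Sample mean and (biased) sample variance of a list of floats."""
    m = sum(values) / len(values)
    var = sum((v - m) ** 2 for v in values) / len(values)
    return m, var


if __name__ == "__main__":
    for mu in (0.01, 0.1):
        m, var = mean_and_variance(simulate_lms_ensemble(mu))
        print(f"mu={mu}: mean={m:.3f}, variance={var:.5f}")
```

In this toy setting the ensemble settles around `w_true` with a spread that grows with the step size `mu`, which illustrates why the equilibrium is a density rather than a single point.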


Parameterising Feature Sensitive Cell Formation in Linsker Networks in the Auditory System

Neural Information Processing Systems

This paper examines and extends the work of Linsker (1986) on self organising feature detectors. Linsker concentrates on the visual processing system, but infers that the weak assumptions made will allow the model to be used in the processing of other sensory information. This claim is examined here, with special attention paid to the auditory system, where there is much lower connectivity and therefore more statistical variability. Online training is utilised, to obtain an idea of training times. These are then compared to the time available to prenatal mammals for the formation of feature sensitive cells.

1 INTRODUCTION

Within the last thirty years, a great deal of research has been carried out in an attempt to understand the development of cells in the pathways between the sensory apparatus and the cortex in mammals. For example, theories for the development of feature detectors were forwarded by Nass and Cooper (1975), by Grossberg (1976) and more recently Obermayer et al. (1990). Hubel and Wiesel (1961) established the existence of several different types of feature sensitive cell in the visual cortex of cats. Various subsequent experiments have shown that a considerable amount of development takes place before birth (i.e.


Word Space

Neural Information Processing Systems

Representations for semantic information about words are necessary for many applications of neural networks in natural language processing. This paper describes an efficient, corpus-based method for inducing distributed semantic representations for a large number of words (50,000) from lexical co-occurrence statistics by means of a large-scale linear regression. The representations are successfully applied to word sense disambiguation using a nearest neighbor method.

1 Introduction

Many tasks in natural language processing require access to semantic information about lexical items and text segments.
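The general idea of deriving word representations from co-occurrence statistics and comparing them by nearest neighbor can be sketched as follows. This is an illustrative toy under stated assumptions, not the method of the paper: it uses raw sentence-level co-occurrence counts and cosine similarity, and it omits the large-scale regression step entirely. The corpus, function names, and candidate lists are hypothetical.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences):
    """Map each word to a Counter of the words sharing a sentence with it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            for j, other in enumerate(tokens):
                if i != j:
                    vectors[word][other] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0


def nearest(word, candidates, vectors):
    """Return the candidate whose vector is most similar to that of word."""
    return max(candidates, key=lambda c: cosine(vectors[word], vectors[c]))


# Hypothetical toy corpus with two topics.
corpus = [
    "the judge ruled in court",
    "the lawyer argued in court",
    "river water flooded fields",
    "stream water flooded fields",
]
vectors = cooccurrence_vectors(corpus)
```

With this corpus, `nearest("stream", ["judge", "river"], vectors)` picks `"river"`, because the two words share all of their sentence contexts, while `"judge"` shares none.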


A Note on Learning Vector Quantization

Neural Information Processing Systems

Vector Quantization is useful for data compression. Competitive Learning, which minimizes reconstruction error, is an appropriate algorithm for vector quantization of unlabelled data. Vector quantization of labelled data for classification has a different objective, namely to minimize the number of misclassifications, and a different algorithm is appropriate. We show that a variant of Kohonen's LVQ2.1 algorithm can be seen as a multiclass extension of an algorithm which, in a restricted two-class case, can be proven to converge to the Bayes optimal classification boundary. We compare the performance of the LVQ2.1 algorithm to that of a modified version having a decreasing window and normalized step size, on a ten-class vowel classification problem.
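For orientation, a single update of the standard LVQ2.1 rule, as it is commonly stated, can be sketched as below. This is a minimal sketch and not the modified version compared in the paper: the window and step size are fixed constants here, and the function name, argument names, and default values are assumptions made for illustration.

```python
import math


def lvq21_step(prototypes, proto_labels, x, y, alpha=0.05, window=0.3):
    """Apply one LVQ2.1 update in place; return True if prototypes moved.

    prototypes: list of lists of floats; proto_labels: their class labels.
    x: training vector; y: its class label.
    """
    dists = [math.dist(p, x) for p in prototypes]
    order = sorted(range(len(prototypes)), key=lambda k: dists[k])
    near, far = order[0], order[1]
    # Exactly one of the two nearest prototypes must carry the correct label.
    if (proto_labels[near] == y) == (proto_labels[far] == y):
        return False
    # Window test: x must lie close to the midplane between the two.
    ratio = dists[near] / dists[far] if dists[far] > 0 else 1.0
    if ratio <= (1.0 - window) / (1.0 + window):
        return False
    right, wrong = (near, far) if proto_labels[near] == y else (far, near)
    # Attract the correctly labelled prototype, repel the other one.
    prototypes[right] = [p + alpha * (xi - p)
                         for p, xi in zip(prototypes[right], x)]
    prototypes[wrong] = [p - alpha * (xi - p)
                         for p, xi in zip(prototypes[wrong], x)]
    return True
```

A decreasing window or a normalized step size, as in the modified version mentioned above, would replace the constant `window` and `alpha` arguments with schedules; their exact form is not given in the abstract and is therefore not sketched here.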



Visual Motion Computation in Analog VLSI Using Pulses

Neural Information Processing Systems

The real time computation of motion from real images using a single chip with integrated sensors is a hard problem. We present two analog VLSI schemes that use pulse domain neuromorphic circuits to compute motion. Pulses of variable width, rather than graded potentials, represent a natural medium for evaluating temporal relationships.


Unsupervised Discrimination of Clustered Data via Optimization of Binary Information Gain

Neural Information Processing Systems

We present the information-theoretic derivation of a learning algorithm that clusters unlabelled data with linear discriminants. In contrast to methods that try to preserve information about the input patterns, we maximize the information gained from observing the output of robust binary discriminators implemented with sigmoid nodes. We derive a local weight adaptation rule via gradient ascent in this objective, demonstrate its dynamics on some simple data sets, relate our approach to previous work and suggest directions in which it may be extended.


Integration of Visual and Somatosensory Information for Preshaping Hand in Grasping Movements

Neural Information Processing Systems

The primate brain must solve two important problems in grasping movements. The first problem concerns the recognition of grasped objects: specifically, how does the brain integrate visual and motor information on a grasped object? The second problem concerns hand shape planning: specifically, how does the brain design the hand configuration suited to the shape of the object and the manipulation task? A neural network model that solves these problems has been developed. The operations of the network are divided into a learning phase and an optimization phase. In the learning phase, internal representations, which depend on the grasped objects and the task, are acquired by integrating visual and somatosensory information. In the optimization phase, the most suitable hand shape for grasping an object is determined by using a relaxation computation of the network.


Interposing an ontogenetic model between Genetic Algorithms and Neural Networks

Neural Information Processing Systems

The relationships between learning, development and evolution in Nature are taken seriously, to suggest a model of the developmental process whereby the genotypes manipulated by the Genetic Algorithm (GA) might be expressed to form phenotypic neural networks (NNet) that then go on to learn. ONTOL is a grammar for generating polynomial NNets for time-series prediction. Genomes correspond to an ordered sequence of ONTOL productions and define a grammar that is expressed to generate a NNet. The NNet's weights are then modified by learning, and the individual's prediction error is used to determine GA fitness. A new gene doubling operator appears critical to the formation of new genetic alternatives in the preliminary but encouraging results presented.


Analog Cochlear Model for Multiresolution Speech Analysis

Neural Information Processing Systems

The tradeoff between time and frequency resolution is viewed as the fundamental difference between conventional spectrographic analysis and cochlear signal processing for broadband, rapidly changing signals. The model's response exhibits a wavelet-like analysis in the scale domain that preserves good temporal resolution; the frequency of each spectral component in a broadband signal can be accurately determined from the interpeak intervals in the instantaneous firing rates of auditory fibers. Such properties of the cochlear model are demonstrated with natural speech and synthetic complex signals.

1 Introduction

As a nonparametric tool, the spectrogram, or short-term Fourier transform, is widely used in analyzing non-stationary signals, such as speech. Usually a window is applied to the running signal and then the Fourier transform is performed. The specific window applied determines the tradeoff between temporal and spectral resolutions of the analysis, as indicated by the uncertainty principle [1].
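The window tradeoff described above can be illustrated with a single frame of a short-term Fourier transform. The sketch below is an assumption-laden toy, unrelated to the cochlear model itself: it uses a rectangular window, a naive DFT, and a synthetic 100 Hz tone sampled at 1000 Hz, and the function names are hypothetical. It shows that halving the window length doubles the spacing between frequency bins.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitudes of DFT bins 0..N/2 for one rectangular-windowed frame."""
    n = len(frame)
    return [abs(sum(s * cmath.exp(-2j * math.pi * k * t / n)
                    for t, s in enumerate(frame)))
            for k in range(n // 2 + 1)]


def peak_frequency(signal, start, window_len, sample_rate):
    """Return (frequency of strongest bin in Hz, bin spacing in Hz)."""
    mags = dft_magnitudes(signal[start:start + window_len])
    bin_spacing = sample_rate / window_len
    return mags.index(max(mags)) * bin_spacing, bin_spacing


# Assumed test signal: a pure 100 Hz tone sampled at 1000 Hz.
sample_rate = 1000
tone = [math.sin(2 * math.pi * 100 * t / sample_rate) for t in range(200)]

if __name__ == "__main__":
    for window_len in (100, 50):
        freq, spacing = peak_frequency(tone, 0, window_len, sample_rate)
        print(f"window={window_len} samples: peak at {freq} Hz, "
              f"bin spacing {spacing} Hz")
```

A 100-sample window spans 0.1 s and separates bins by 10 Hz, whereas a 50-sample window spans 0.05 s and separates bins by 20 Hz: better time localization is paid for with coarser frequency resolution.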