Extracting Rules from Artificial Neural Networks with Distributed Representations

Neural Information Processing Systems

Although artificial neural networks have been applied in a variety of real-world scenarios with remarkable success, they have often been criticized for exhibiting a low degree of human comprehensibility. Techniques that compile compact sets of symbolic rules out of artificial neural networks offer a promising perspective to overcome this obvious deficiency of neural network representations. This paper presents an approach to the extraction of if-then rules from artificial neural networks. Its key mechanism is validity interval analysis, which is a generic tool for extracting symbolic knowledge by propagating rule-like knowledge through Backpropagation-style neural networks. Empirical studies in a robot arm domain illustrate the appropriateness of the proposed method for extracting rules from networks with real-valued and distributed representations.
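The core of validity interval analysis is propagating interval constraints on the inputs forward through the network. As a minimal sketch (the paper's full method also refines intervals backward via linear programming, which is omitted here), the snippet below propagates input intervals through one sigmoid layer; the weights and intervals are hypothetical:

```python
import math

def propagate_intervals(lo, hi, weights, biases):
    """Forward-propagate input validity intervals [lo_i, hi_i] through one
    sigmoid layer. For each unit, the pre-activation extremes are obtained by
    pairing each weight with the interval end that minimizes/maximizes its
    contribution; the monotone sigmoid then maps these to output bounds."""
    sig = lambda x: 1.0 / (1.0 + math.exp(-x))
    out_lo, out_hi = [], []
    for w_row, b in zip(weights, biases):
        net_lo = b + sum(w * (l if w > 0 else h) for w, l, h in zip(w_row, lo, hi))
        net_hi = b + sum(w * (h if w > 0 else l) for w, l, h in zip(w_row, lo, hi))
        out_lo.append(sig(net_lo))
        out_hi.append(sig(net_hi))
    return out_lo, out_hi

# Two inputs constrained to [0, 1] x [0.5, 1], two hidden units
lo_out, hi_out = propagate_intervals([0.0, 0.5], [1.0, 1.0],
                                     [[2.0, -1.0], [1.0, 1.0]], [0.0, -1.0])
```

Repeating this layer by layer yields validity intervals on the outputs, which can then be read off as the consequent of an if-then rule over the input constraints.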


Dynamic Cell Structures

Neural Information Processing Systems

Dynamic Cell Structures (DCS) represent a family of artificial neural architectures suited both for unsupervised and supervised learning. They belong to the recently introduced [Martinetz94] class of Topology Representing Networks (TRN), which build perfectly topology preserving feature maps. DCS employ a modified Kohonen learning rule in conjunction with competitive Hebbian learning. The Kohonen type learning rule serves to adjust the synaptic weight vectors while Hebbian learning establishes a dynamic lateral connection structure between the units reflecting the topology of the feature manifold.
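The interplay of the two rules can be sketched in a few lines. This is a simplified single update, not the paper's exact formulation; the learning rates and pruning threshold are illustrative:

```python
import numpy as np

def dcs_step(units, conn, x, eps_best=0.1, eps_second=0.01, decay=0.9, theta=0.05):
    """One simplified Dynamic Cell Structures update.
    Competitive Hebbian learning: the lateral connection between the two units
    closest to input x is set to full strength while all other connections
    decay (weak edges are pruned), so the graph tracks the feature topology.
    Kohonen-style rule: the best-matching unit (and, more weakly, the
    runner-up) moves toward x, adjusting the synaptic weight vectors."""
    d = np.linalg.norm(units - x, axis=1)
    best, second = np.argsort(d)[:2]
    conn *= decay                                   # age all lateral edges
    conn[best, second] = conn[second, best] = 1.0   # refresh the winning edge
    conn[conn < theta] = 0.0                        # prune stale edges
    units[best] += eps_best * (x - units[best])
    units[second] += eps_second * (x - units[second])
    return best, second
```

Iterating this over a stream of samples both places the units on the data manifold and wires up a lateral graph reflecting its topology.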


A solvable connectionist model of immediate recall of ordered lists

Neural Information Processing Systems

A model of short-term memory for serially ordered lists of verbal stimuli is proposed as an implementation of the 'articulatory loop' thought to mediate this type of memory (Baddeley, 1986). The model predicts the presence of a repeatable time-varying 'context' signal coding the timing of items' presentation in addition to a store of phonological information and a process of serial rehearsal. Items are associated with context nodes and phonemes by Hebbian connections showing both short and long term plasticity. Items are activated by phonemic input during presentation and reactivated by context and phonemic feedback during output. Serial selection of items occurs via a winner-take-all interaction amongst items, with the winner subsequently receiving decaying inhibition. An approximate analysis of error probabilities due to Gaussian noise during output is presented. The model provides an explanatory account of the probability of error as a function of serial position, list length, word length, phonemic similarity, temporal grouping, item and list familiarity, and is proposed as the starting point for a model of rehearsal and vocabulary acquisition.
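The winner-take-all selection with decaying post-recall inhibition can be illustrated in isolation. This toy version (with made-up activation and inhibition parameters, and none of the model's noise or phonemic feedback) shows how the mechanism yields recall in serial order:

```python
import numpy as np

def recall_sequence(activations, n_recall, inhibition=2.0, decay=0.9):
    """Winner-take-all serial recall: at each output step the most active
    item wins and is emitted, then receives strong inhibition that decays
    over subsequent steps, so items with an activation gradient (e.g. from
    a primacy-weighted context signal) emerge in presentation order."""
    act = np.array(activations, dtype=float)
    supp = np.zeros_like(act)
    order = []
    for _ in range(n_recall):
        winner = int(np.argmax(act - supp))
        order.append(winner)
        supp[winner] += inhibition   # suppress the just-recalled item
        supp *= decay                # earlier inhibition gradually wears off
    return order

# A primacy-graded activation profile is recalled in order
print(recall_sequence([0.9, 0.7, 0.5, 0.3], n_recall=4))  # → [0, 1, 2, 3]
```

In the full model, adding Gaussian noise to the activations before the argmax produces the transposition errors whose probabilities the paper analyses.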


Direction Selectivity In Primary Visual Cortex Using Massive Intracortical Connections

Neural Information Processing Systems

Almost all models of orientation and direction selectivity in visual cortex are based on feedforward connection schemes, where geniculate input provides all excitation to both pyramidal and inhibitory neurons. The latter neurons then suppress the response of the former for non-optimal stimuli. However, anatomical studies show that up to 90% of the excitatory synaptic input onto any cortical cell is provided by other cortical cells. The massive excitatory feedback nature of cortical circuits is embedded in the canonical microcircuit of Douglas & Martin (1991). We here investigate analytically and through biologically realistic simulations the functioning of a detailed model of this circuitry, operating in a hysteretic mode. In the model, weak geniculate input is dramatically amplified by intracortical excitation, while inhibition has a dual role: (i) to prevent the early geniculate-induced excitation in the null direction and (ii) to restrain excitation and ensure that the neurons fire only when the stimulus is in their receptive field.
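The amplification principle can be captured by a one-unit linear rate model, far simpler than the paper's detailed simulations. All parameter values here are illustrative: recurrent gain `w_exc` multiplies the unit's own firing rate back into its input, so a weak geniculate drive settles at `g / (1 - w_exc + w_inh)`:

```python
def amplify(geniculate, w_exc=0.9, w_inh=0.0, steps=50):
    """Toy rate model of recurrent cortical amplification: weak feedforward
    (geniculate) drive is boosted by recurrent excitation w_exc, while
    subtractive inhibition w_inh restrains the response. Iterates the
    rectified update r <- [g + (w_exc - w_inh) r]_+ to its fixed point."""
    r = 0.0
    for _ in range(steps):
        r = max(0.0, geniculate + w_exc * r - w_inh * r)
    return r

# With w_exc = 0.9, a drive of 0.1 is amplified roughly tenfold (to ~1.0);
# adding inhibition w_inh = 0.5 pulls the steady state down to 0.1 / 0.6.
```

This illustrates why intracortical inhibition is needed in the model: without it, the high recurrent gain would amplify even null-direction geniculate input.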


Plasticity-Mediated Competitive Learning

Neural Information Processing Systems

Differentiation between the nodes of a competitive learning network is conventionally achieved through competition on the basis of neural activity. Simple inhibitory mechanisms are limited to sparse representations, while decorrelation and factorization schemes that support distributed representations are computationally unattractive. By letting neural plasticity mediate the competitive interaction instead, we obtain diffuse, nonadaptive alternatives for fully distributed representations. We use this technique to simplify and improve our binary information gain optimization algorithm for feature extraction (Schraudolph and Sejnowski, 1993); the same approach could be used to improve other learning algorithms. 1 INTRODUCTION Unsupervised neural networks frequently employ sets of nodes or subnetworks with identical architecture and objective function. Some form of competitive interaction is then needed for these nodes to differentiate and efficiently complement each other in their task.


Using Voice Transformations to Create Additional Training Talkers for Word Spotting

Neural Information Processing Systems

Lack of training data has always been a constraint in training speech recognizers. This research presents a voice transformation technique which increases the variety among training talkers. The resulting more varied training set provided up to 2.9 percentage points of improvement in the figure of merit (average detection rate) of a high performance word spotter. This improvement is similar to the increase in performance provided by doubling the amount of training data (Carlson, 1994). This technique can also be applied to other speech recognition systems such as continuous speech recognition, talker identification, and isolated speech recognition.


Inferring Ground Truth from Subjective Labelling of Venus Images

Neural Information Processing Systems

In practical situations, experts may visually examine the images and provide a subjective noisy estimate of the truth. Calibrating the reliability and bias of expert labellers is a nontrivial problem. In this paper we discuss some of our recent work on this topic in the context of detecting small volcanoes in Magellan SAR images of Venus. Empirical results (using the Expectation-Maximization procedure) suggest that accounting for subjective noise can be quite significant in terms of quantifying both human and algorithm detection performance.
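The EM approach to this problem can be sketched for the simplest case: binary labels, labelers modeled by a single symmetric accuracy, and independence between labelers. This is a minimal illustration (essentially a stripped-down Dawid-Skene model), not the paper's exact formulation:

```python
import numpy as np

def em_ground_truth(labels, n_iter=20):
    """EM estimate of latent binary ground truth from several noisy labelers.
    labels: (n_items, n_labelers) matrix of 0/1 votes.
    E-step: posterior probability that each item is positive, given current
    labeler accuracies. M-step: each labeler's accuracy is re-estimated as
    the expected fraction of agreements with the latent truth."""
    n_items, n_lab = labels.shape
    acc = np.full(n_lab, 0.7)       # initial guess: labelers beat chance
    prior = 0.5
    for _ in range(n_iter):
        # E-step: independent-labeler likelihoods for truth = 1 and truth = 0
        like1 = np.prod(np.where(labels == 1, acc, 1 - acc), axis=1)
        like0 = np.prod(np.where(labels == 0, acc, 1 - acc), axis=1)
        post = prior * like1 / (prior * like1 + (1 - prior) * like0)
        # M-step: accuracies and class prior from the soft posteriors
        agree = labels * post[:, None] + (1 - labels) * (1 - post[:, None])
        acc = agree.mean(axis=0)
        prior = post.mean()
    return post, acc
```

Running this on a small vote matrix recovers both a calibrated "probabilistic ground truth" per item and a reliability estimate per labeler, which is exactly what is needed to score volcano detectors against subjective labels.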


Non-linear Prediction of Acoustic Vectors Using Hierarchical Mixtures of Experts

Neural Information Processing Systems

We are concerned in this paper with the application of multiple models, specifically the Hierarchical Mixtures of Experts, to time series prediction, specifically the problem of predicting acoustic vectors for use in speech coding. There have been a number of applications of multiple models in time series prediction. A classic example is the Threshold Autoregressive model (TAR) which was used by Tong & Lim (1980) to predict sunspot activity. More recently, Lewis, Kay and Stevens (in Weigend & Gershenfeld (1994)) describe the use of Multivariate Adaptive Regression Splines (MARS) for the prediction of future values of currency exchange rates. Finally, in speech prediction, Cuperman & Gersho (1985) describe the Switched Inter-frame Vector Prediction (SIVP) method which switches between separate linear predictors trained on different statistical classes of speech.
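The common thread among these multiple-model methods is a gate choosing among (or blending) expert predictors. As a hedged sketch, here is a single-level soft-gated mixture of linear experts; the full HME nests such gates in a tree, and all weights below are illustrative rather than trained:

```python
import numpy as np

def moe_predict(x, experts, gate_w):
    """One-level mixture-of-experts prediction: each expert is a linear
    predictor (w, b); a softmax gate over the input x decides how much each
    expert contributes. Hard switching (as in TAR or SIVP) corresponds to
    the limit where the gate puts all its mass on one expert."""
    logits = gate_w @ x
    g = np.exp(logits - logits.max())
    g /= g.sum()                      # softmax gating weights
    preds = np.array([w @ x + b for w, b in experts])
    return float(g @ preds)

# Two hypothetical experts and a gate over a 2-dim input
experts = [(np.array([1.0, 0.0]), 0.0), (np.array([0.0, 2.0]), 3.0)]
gate_w = np.array([[5.0, 0.0], [0.0, 5.0]])
print(moe_predict(np.array([1.0, 0.0]), experts, gate_w))  # gate favors expert 0
```

For acoustic vector prediction, x would be previous frames' coefficients and each expert would specialize on a statistical class of speech, with the gate learned jointly by EM or gradient ascent.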


Patterns of damage in neural networks: The effects of lesion area, shape and number

Neural Information Processing Systems

Understanding the response of neural nets to structural/functional damage is important for a variety of reasons, e.g., in assessing the performance of neural network hardware, and in gaining understanding of the mechanisms underlying neurological and psychiatric disorders. Recently, there has been a growing interest in constructing neural models to study how specific pathological neuroanatomical and neurophysiological changes can result in various clinical manifestations, and to investigate the functional organization of the symptoms that result from specific brain pathologies (reviewed in [1, 2]). In the area of associative memory models specifically, early studies found an increase in memory impairment with increasing lesion severity (in accordance with Lashley's classical 'mass action' principle), and showed that slowly developing lesions have less pronounced effects than equivalent acute lesions [3]. Recently, it was shown that the gradual pattern of clinical deterioration manifested in the majority of Alzheimer's patients can be accounted for, and that different synaptic compensation rates can account for the observed variation in the severity and progression rate of this disease [4]. However, this past work is limited in that model elements have no spatial relationships to one another (all elements are conceptually equidistant).
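The kind of lesion experiment described above can be reproduced on a spatially unstructured (all-elements-equidistant) associative memory in a few lines. This sketch uses a standard Hopfield-style network with Hebbian storage; all sizes and lesion fractions are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def lesion_recall(n=200, n_patterns=5, lesion_frac=0.3, steps=10):
    """Hopfield-style associative memory lesion experiment: store random
    +-1 patterns with the Hebb rule, knock out a random fraction of the
    synapses (a diffuse lesion with no spatial structure), then measure how
    well a corrupted cue is restored to its stored pattern."""
    pats = rng.choice([-1, 1], size=(n_patterns, n))
    W = (pats.T @ pats).astype(float) / n
    np.fill_diagonal(W, 0.0)
    W *= rng.random(W.shape) > lesion_frac     # zero out lesioned synapses
    s = pats[0].copy()
    s[: n // 10] *= -1                         # cue: 10% of the bits flipped
    for _ in range(steps):
        s = np.sign(W @ s)
        s[s == 0] = 1
    return float(s @ pats[0]) / n              # overlap with stored pattern
```

Sweeping `lesion_frac` reproduces the graded impairment of the 'mass action' picture; the spatially structured lesions (varying area, shape and number) studied in this paper require adding the geometry that this diffuse version lacks.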


Convergence Properties of the K-Means Algorithms

Neural Information Processing Systems

K-Means is a popular clustering algorithm used in many applications, including the initialization of more computationally expensive algorithms (Gaussian mixtures, Radial Basis Functions, Learning Vector Quantization and some Hidden Markov Models). The practice of this initialization procedure often gives the frustrating feeling that K-Means performs most of the task in a small fraction of the overall time. This motivated us to better understand this convergence speed. A second reason lies in the traditional debate between hard threshold (e.g.
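For reference, the hard-threshold (batch) variant under discussion is Lloyd's algorithm; the sketch below is a generic implementation, not the paper's analysis:

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Batch (hard threshold) K-Means, i.e. Lloyd's algorithm: assign each
    point to its nearest centroid, move each centroid to the mean of its
    assigned points, and repeat until the centroids stop moving. Most of
    the drop in distortion typically occurs in the first few iterations,
    which is why K-Means is popular as a cheap initializer."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)          # hard-threshold assignment
        new = np.array([X[assign == j].mean(axis=0) if np.any(assign == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break                              # fixed point reached
        centers = new
    return centers, assign
```

Each iteration can only decrease the total squared distortion, and since there are finitely many assignments, the algorithm reaches a fixed point in finitely many steps; the question studied here is how fast.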