Information Measure Based Skeletonisation

Neural Information Processing Systems

Automatic determination of proper neural network topology by trimming oversized networks is an important area of study, which has previously been addressed using a variety of techniques. In this paper, we present Information Measure Based Skeletonisation (IMBS), a new approach to this problem where superfluous hidden units are removed based on their information measure (IM). This measure, borrowed from decision tree induction techniques, reflects the degree to which the hyperplane formed by a hidden unit discriminates between training data classes. We show the results of applying IMBS to three classification tasks and demonstrate that it removes a substantial number of hidden units without significantly affecting network performance.
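The abstract does not give a formula for the information measure. As a rough illustration only, the sketch below assumes it behaves like the information gain used in decision tree induction: the training data are split by which side of a hidden unit's hyperplane they fall on, and the resulting reduction in class entropy is the unit's score. The function names, the toy data, and the choice of information gain are assumptions, not details taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_measure(weights, bias, inputs, labels):
    """Assumed score: information gain of splitting by the hyperplane w.x + b = 0."""
    above, below = [], []
    for x, y in zip(inputs, labels):
        activation = sum(w * xi for w, xi in zip(weights, x)) + bias
        (above if activation > 0 else below).append(y)
    n = len(labels)
    # Weighted class entropy that remains after the split.
    remainder = (len(above) / n) * entropy(above) + (len(below) / n) * entropy(below)
    return entropy(labels) - remainder


# Toy data (hypothetical): the class depends only on the first input.
inputs = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
labels = [0, 0, 1, 1]

separating = information_measure((1.0, 0.0), -0.5, inputs, labels)  # splits on x0
useless = information_measure((0.0, 1.0), -0.5, inputs, labels)     # splits on x1
```

Under this assumed scoring, a unit such as `useless`, whose hyperplane leaves the class mixture unchanged on both sides, would be a natural candidate for removal.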


Active Exploration in Dynamic Environments

Neural Information Processing Systems

Many real-valued connectionist approaches to learning control realize exploration by randomness in action selection. This might be disadvantageous when costs are assigned to "negative experiences". The basic idea presented in this paper is to make an agent explore unknown regions in a more directed manner. This is achieved by a so-called competence map, which is trained to predict the controller's accuracy, and is used for guiding exploration. Based on this, a bistable system enables smoothly switching attention between two behaviors - exploration and exploitation - depending on expected costs and knowledge gain. The appropriateness of this method is demonstrated by a simple robot navigation task.


Perturbing Hebbian Rules

Neural Information Processing Systems

Feedforward networks composed of units which compute a sigmoidal function of a weighted sum of their inputs have been much investigated. We tested the approximation and estimation capabilities of networks using functions more complex than sigmoids. Three classes of functions were tested: polynomials, rational functions, and flexible Fourier series. Unlike sigmoids, these classes can fit nonmonotonic functions. They were compared on three problems: prediction of Boston housing prices, the sunspot count, and robot arm inverse dynamics. The complex units attained clearly superior performance on the robot arm problem, which is a highly nonmonotonic, pure approximation problem. On the noisy and only mildly nonlinear Boston housing and sunspot problems, differences among the complex units were revealed; polynomials did poorly, whereas rationals and flexible Fourier series were comparable to sigmoids.


A Connectionist Learning Approach to Analyzing Linguistic Stress

Neural Information Processing Systems

We use connectionist modeling to develop an analysis of stress systems in terms of ease of learnability. In traditional linguistic analyses, learnability arguments determine default parameter settings based on the feasibility of logically deducing correct settings from an initial state. Our approach provides an empirical alternative to such arguments. Based on perceptron learning experiments using data from nineteen human languages, we develop a novel characterization of stress patterns in terms of six parameters. These provide both a partial description of the stress pattern itself and a prediction of its learnability, without invoking abstract theoretical constructs such as metrical feet. This work demonstrates that machine learning methods can provide a fresh approach to understanding linguistic phenomena.


Optical Implementation of a Self-Organizing Feature Extractor

Neural Information Processing Systems

We demonstrate a self-organizing system based on photorefractive ring oscillators. We employ the system in two ways that can both be thought of as feature extractors; one acts on a set of images exposed repeatedly to the system strictly as a linear feature extractor, and the other serves as a signal demultiplexer for fiber optic communications. Both systems implement unsupervised competitive learning embedded within the mode interaction dynamics between the modes of a set of ring oscillators. After a training period, the modes of the rings become associated with the different image features or carrier frequencies within the incoming data stream.


Learning in Feedforward Networks with Nonsmooth Functions

Neural Information Processing Systems

This paper is concerned with the problem of learning in networks where some or all of the functions involved are not smooth. Examples of such networks are those whose neural transfer functions are piecewise-linear and those whose error function is defined in terms of the L∞ norm. Up to now, networks whose neural transfer functions are piecewise-linear have received very little consideration in the literature, but the possibility of using an error function defined in terms of the L∞ norm has received some attention. In this paper we draw upon some recent results from the field of nonsmooth optimization (NSO) to present an algorithm for the nonsmooth case. Our motivation for this work arose out of the fact that we have been able to show that, in backpropagation, an error function based upon the L∞ norm overcomes the difficulties which can occur when using the L2 norm.
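To make the nonsmoothness concrete, the following minimal sketch contrasts an L2 error with an L∞ (maximum) error on a single output vector. The L∞ error depends only on the worst residual, so it has a kink wherever the identity of the worst residual changes, and gradient methods have to fall back on a subgradient there. This only illustrates the two error functions; it is not the NSO algorithm of the paper, and all names and numbers are hypothetical.

```python
import math


def l2_error(outputs, targets):
    """Euclidean (L2) norm of the residual vector."""
    return math.sqrt(sum((o - t) ** 2 for o, t in zip(outputs, targets)))


def linf_error(outputs, targets):
    """Maximum (L-infinity) norm of the residual vector."""
    return max(abs(o - t) for o, t in zip(outputs, targets))


def linf_subgradient(outputs, targets):
    """One subgradient of the L-infinity error with respect to the outputs."""
    residuals = [o - t for o, t in zip(outputs, targets)]
    # Only the component with the largest absolute residual contributes.
    worst = max(range(len(residuals)), key=lambda i: abs(residuals[i]))
    grad = [0.0] * len(residuals)
    grad[worst] = 1.0 if residuals[worst] > 0 else -1.0
    return grad


# Hypothetical network outputs and targets.
outputs = [0.9, 0.2, 0.4]
targets = [1.0, 0.0, 0.5]

e2 = l2_error(outputs, targets)              # all residuals contribute
einf = linf_error(outputs, targets)          # only the worst residual counts
ginf = linf_subgradient(outputs, targets)    # nonzero in one component only
```

When two residuals tie for the maximum, `linf_subgradient` returns just one of several valid subgradients, which is exactly the situation that smooth optimization methods do not handle.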


Statistical Reliability of a Blowfly Movement-Sensitive Neuron

Neural Information Processing Systems

We develop a model-independent method for characterizing the reliability of neural responses to brief stimuli. This approach allows us to measure the discriminability of similar stimuli, based on the real-time response of a single neuron. Neurophysiological data were obtained from a movement-sensitive neuron (H1) in the visual system of the blowfly Calliphora erythrocephala. Furthermore, recordings were made from blowfly photoreceptor cells to quantify the signal-to-noise ratios in the peripheral visual system. As photoreceptors form the input to the visual system, the reliability of their signals ultimately determines the reliability of any visual discrimination task. For the case of movement detection, this limit can be computed and compared to the H1 neuron's reliability. Under favorable conditions, the performance of the H1 neuron closely approaches the theoretical limit, which means that under these conditions the nervous system adds little noise in the process of computing movement from the correlations of signals in the photoreceptor array.


Competitive Anti-Hebbian Learning of Invariants

Neural Information Processing Systems

Although the detection of invariant structure in a given set of input patterns is vital to many recognition tasks, connectionist learning rules tend to focus on directions of high variance (principal components). The prediction paradigm is often used to reconcile this dichotomy; here we suggest a more direct approach to invariant learning based on an anti-Hebbian learning rule. An unsupervised two-layer network implementing this method in a competitive setting learns to extract coherent depth information from random-dot stereograms.

1 INTRODUCTION: LEARNING INVARIANT STRUCTURE

Many connectionist learning algorithms share with principal component analysis (Jolliffe, 1986) the strategy of extracting the directions of highest variance from the input. A single Hebbian neuron, for instance, will come to encode the input's first principal component (Oja and Karhunen, 1985); various forms of lateral interaction can be used to force a layer of such nodes to differentiate and span the principal component subspace - cf. (Sanger, 1989; Kung, 1990; Leen, 1991), and others. The same type of representation also develops in the hidden layer of backpropagation autoassociator networks (Baldi and Hornik, 1989).
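The contrast between Hebbian and anti-Hebbian learning can be sketched with a single linear unit. The code below is a simplified illustration, not the competitive two-layer network of the paper: it uses a plain Hebbian update followed by explicit weight normalization (in place of the implicit decay term of Oja's rule), and flipping the sign of the update gives the anti-Hebbian variant. The data distribution, learning rate, and function names are all assumptions.

```python
import math
import random


def train_unit(samples, sign, rate=0.01, epochs=20, seed=0):
    """Train one linear unit; sign=+1 is Hebbian, sign=-1 is anti-Hebbian."""
    rng = random.Random(seed)
    w = [rng.uniform(-1.0, 1.0) for _ in samples[0]]
    for _ in range(epochs):
        for x in samples:
            y = sum(wi * xi for wi, xi in zip(w, x))          # unit output
            w = [wi + sign * rate * y * xi for wi, xi in zip(w, x)]
            norm = math.sqrt(sum(wi * wi for wi in w))        # keep |w| = 1
            w = [wi / norm for wi in w]
    return w


# Hypothetical 2-D inputs: high variance along axis 0, low variance along axis 1.
data_rng = random.Random(1)
samples = [(data_rng.gauss(0.0, 3.0), data_rng.gauss(0.0, 0.3)) for _ in range(500)]

hebbian = train_unit(samples, sign=+1)        # aligns with the high-variance axis
anti_hebbian = train_unit(samples, sign=-1)   # aligns with the low-variance axis
```

In this toy setting the Hebbian unit settles on the direction of highest variance, while the anti-Hebbian unit settles on the direction along which the input barely changes, which is the sense in which such a rule can pick out invariants.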



Human and Machine 'Quick Modeling'

Neural Information Processing Systems

We present here an interesting experiment in 'quick modeling' by humans, performed independently on small samples, in several languages and on two continents, over the last three years. Comparisons to decision tree procedures and neural net processing are given. From these, we conjecture that human reasoning is better represented by the latter, but substantially different from both. Implications for the 'strong convergence hypothesis' between neural networks and machine learning are discussed, now expanded to include human reasoning comparisons.

1 INTRODUCTION

Until recently the fields of symbolic and connectionist learning evolved separately. Suddenly, in the last two years, a significant number of papers comparing the two methodologies have appeared. A beginning synthesis of these two fields was forged at the NIPS '90 Workshop #5 last year (Pratt and Norton, 1990), where one may find a good bibliography of the recent work of Atlas, Dietterich, Omohundro, Sanger, Shavlik, Tsoi, Utgoff and others. It was at that NIPS '90 Workshop that we learned of these studies, most of which concentrate on performance comparisons of decision tree algorithms (such as ID3, CART) and neural net algorithms (such as Perceptrons, Backpropagation). Independently, three years ago, we had looked at Quinlan's ID3 scheme (Quinlan, 1984). Intuitively and almost instantly disagreeing with the generalization ID3 obtains from a sample of 8 items generalized to 12 items, we subjected this example to a variety of human experiments. We report our findings, as compared to the performance of ID3 and also to various neural net computations.