Technology
Template-Based Algorithms for Connectionist Rule Extraction
Alexander, Jay A., Mozer, Michael C.
Casting neural network weights in symbolic terms is crucial for interpreting and explaining the behavior of a network. Additionally, in some domains, a symbolic description may lead to more robust generalization. We present a principled approach to symbolic rule extraction based on the notion of weight templates, parameterized regions of weight space corresponding to specific symbolic expressions. With an appropriate choice of representation, we show how template parameters may be efficiently identified and instantiated to yield the optimal match to a unit's actual weights.
Limits on Learning Machine Accuracy Imposed by Data Quality
Cortes, Corinna, Jackel, L. D., Chiang, Wan-Ping
Random errors and insufficiencies in databases limit the performance of any classifier trained from and applied to the database. In this paper we propose a method to estimate the limiting performance of classifiers imposed by the database. We demonstrate this technique on the task of predicting failure in telecommunication paths. 1 Introduction Data collection for a classification or regression task is prone to random errors, e.g.
PCA-Pyramids for Image Compression
This paper presents a new method for image compression by neural networks. First, we show that we can use neural networks in a pyramidal framework, yielding the so-called PCA pyramids. Then we present an image compression method based on the PCA pyramid, which is similar to the Laplace pyramid and wavelet transform. Some experimental results with real images are reported. Finally, we present a method to combine the quantization step with the learning of the PCA pyramid. 1 Introduction In the past few years, a lot of work has been done on using neural networks for image compression, d. e.g.
Direction Selectivity In Primary Visual Cortex Using Massive Intracortical Connections
Suarez, Humbert, Koch, Christof, Douglas, Rodney
Almost all models of orientation and direction selectivity in visual cortex are based on feedforward connection schemes, where geniculate input provides all excitation to both pyramidal and inhibitory neurons. The latter neurons then suppress the response of the former for non-optimal stimuli. However, anatomical studies show that up to 90 % of the excitatory synaptic input onto any cortical cell is provided by other cortical cells. The massive excitatory feedback nature of cortical circuits is embedded in the canonical microcircuit of Douglas &. Martin (1991). We here investigate analytically and through biologically realistic simulations the functioning of a detailed model of this circuitry, operating in a hysteretic mode. In the model, weak geniculate input is dramatically amplified by intracortical excitation, while inhibition has a dual role: (i) to prevent the early geniculate-induced excitation in the null direction and (ii) to restrain excitation and ensure that the neurons fire only when the stimulus is in their receptive-field.
Using a neural net to instantiate a deformable model
Williams, Christopher K. I., Revow, Michael, Hinton, Geoffrey E.
Deformable models are an attractive approach to recognizing nonrigid objects which have considerable within class variability. However, there are severe search problems associated with fitting the models to data. We show that by using neural networks to provide better starting points, the search time can be significantly reduced. The method is demonstrated on a character recognition task. In previous work we have developed an approach to handwritten character recognition based on the use of deformable models (Hinton, Williams and Revow, 1992a; Revow, Williams and Hinton, 1993). We have obtained good performance with this method, but a major problem is that the search procedure for fitting each model to an image is very computationally intensive, because there is no efficient algorithm (like dynamic programming) for this task. In this paper we demonstrate that it is possible to "compile down" some of the knowledge gained while fitting models to data to obtain better starting points that significantly reduce the search time. 1 DEFORMABLE MODELS FOR DIGIT RECOGNITION The basic idea in using deformable models for digit recognition is that each digit has a model, and a test image is classified by finding the model which is most likely to have generated it. The quality of the match between model and test image depends on the deformation of the model, the amount of ink that is attributed to noise and the distance of the remaining ink from the deformed model.
Using Voice Transformations to Create Additional Training Talkers for Word Spotting
Chang, Eric I., Lippmann, Richard P.
Lack of training data has always been a constraint in training speech recognizers. This research presents a voice transformation technique which increases the variety among training talkers. The resulting more varied training set provided up to 2.9 percentage points of improvement in the figure of merit (average detection rate) of a high performance word spotter. This improvement is similar to the increase in performance provided by doubling the amount of training data (Carlson, 1994). This technique can also be applied to other speech recognition systems such as continuous speech recognition, talker identification, and isolated speech recognition.
A Silicon Axon
Minch, Bradley A., Hasler, Paul E., Diorio, Chris, Mead, Carver
It is well known that axons are neural processes specialized for transmitting information over relatively long distances in the nervous system. Impulsive electrical disturbances known as action potentials are normally initiated near the cell body of a neuron when the voltage across the cell membrane crosses a threshold. These pulses are then propagated with a fairly stereotypical shape at a more or less constant velocity down the length of the axon. Consequently, axons excel at precisely preserving the relative timing of threshold crossing events but do not preserve any of the initial signal shape. Information, then, is presumably encoded in the relative timing of action potentials.
Using a Saliency Map for Active Spatial Selective Attention: Implementation & Initial Results
Baluja, Shumeet, Pomerleau, Dean A.
School of Computer Science School of Computer Science Carnegie Mellon University Carnegie Mellon University Pittsburgh, PA 15213 Pittsburgh, PA 15213 Abstract In many vision based tasks, the ability to focus attention on the important portions of a scene is crucial for good performance on the tasks. In this paper we present a simple method of achieving spatial selective attention through the use of a saliency map. The saliency map indicates which regions of the input retina are important for performing the task. The saliency map is created through predictive auto-encoding. The performance of this method is demonstrated on two simple tasks which have multiple very strong distracting features in the input retina. Architectural extensions and application directions for this model are presented. On some tasks this extra input can easily be ignored. Nonetheless, often the similarity between the important input features and the irrelevant features is great enough to interfere with task performance.
Combining Estimators Using Non-Constant Weighting Functions
Tresp, Volker, Taniguchi, Michiaki
This paper discusses the linearly weighted combination of estimators in which the weighting functions are dependent on the input. We show that the weighting functions can be derived either by evaluating the input dependent variance of each estimator or by estimating how likely it is that a given estimator has seen data in the region of the input space close to the input pattern. The latter solution is closely related to the mixture of experts approach and we show how learning rules for the mixture of experts can be derived from the theory about learning with missing features. The presented approaches are modular since the weighting functions can easily be modified (no retraining) if more estimators are added. Furthermore, it is easy to incorporate estimators which were not derived from data such as expert systems or algorithms.