Combined Neural Network and Rule-Based Framework for Probabilistic Pattern Recognition and Discovery

Neural Information Processing Systems

A combined neural network and rule-based approach is suggested as a general framework for pattern recognition. This approach enables unsupervised and supervised learning, respectively, while providing probability estimates for the output classes. The probability maps are utilized for higher-level analysis, such as feedback for smoothing the output label maps and the identification of unknown patterns (pattern "discovery"). The suggested approach is presented and demonstrated on the texture-analysis task. A correct classification rate in the 90th percentile is achieved for both unstructured and structured natural texture mosaics. The advantages of the probabilistic approach to pattern analysis are demonstrated.
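The abstract leaves the decision rule unspecified; as a minimal sketch of how per-pixel class-probability maps can drive pattern discovery, one can label by the maximum posterior and reject low-confidence pixels as unknown. The threshold and the -1 "unknown" code below are illustrative choices, not the authors'.

```python
import numpy as np

def classify_with_discovery(probs, reject_threshold=0.5):
    """Label each pixel from a class-probability map, flagging
    low-confidence pixels as 'unknown' (pattern discovery).

    probs: (H, W, C) array of per-class posterior estimates.
    Returns an (H, W) label map where -1 marks unknown patterns.
    """
    labels = probs.argmax(axis=-1)               # most probable class per pixel
    confidence = probs.max(axis=-1)              # its posterior estimate
    labels[confidence < reject_threshold] = -1   # low confidence -> "unknown"
    return labels
```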


Shooting Craps in Search of an Optimal Strategy for Training Connectionist Pattern Classifiers

Neural Information Processing Systems

We compare two strategies for training connectionist (as well as non-connectionist) models for statistical pattern recognition. The probabilistic strategy is based on the notion that Bayesian discrimination (i.e., optimal classification) is achieved when the classifier learns the a posteriori class distributions of the random feature vector. The differential strategy is based on the notion that the identity of the largest a posteriori class probability of the feature vector is all that is needed to achieve Bayesian discrimination. Each strategy is directly linked to a family of objective functions that can be used in the supervised training procedure. We prove that the probabilistic strategy, linked with the error-measure objective functions typically used to train classifiers (such as mean-squared error and cross-entropy), necessarily requires larger training sets and more complex classifier architectures than those needed to approximate the Bayesian discriminant function. In contrast, the differential strategy does not.
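As a hedged illustration of the two families (not the paper's exact objective functions): a cross-entropy loss drives the outputs toward the full posterior distribution, while a differential, classification-figure-of-merit-style loss depends only on the margin between the correct class and its strongest rival. The sigmoid form and the constant k below are illustrative.

```python
import numpy as np

def cross_entropy(outputs, target_idx):
    """Probabilistic strategy: drive outputs toward the full a posteriori
    class distribution (here, a one-hot target)."""
    eps = 1e-12  # guard against log(0)
    return -np.log(outputs[target_idx] + eps)

def differential_margin(outputs, target_idx, k=1.0):
    """Differential strategy (illustrative): only the identity of the
    largest output matters, so penalize a sigmoid of the margin between
    the correct class and its strongest competitor."""
    rivals = np.delete(outputs, target_idx)
    margin = outputs[target_idx] - rivals.max()
    return 1.0 / (1.0 + np.exp(k * margin))  # small when margin >> 0
```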



Tangent Prop - A formalism for specifying selected invariances in an adaptive network

Neural Information Processing Systems

In many machine learning applications, one has access not only to training data, but also to some high-level a priori knowledge about the desired behavior of the system. For example, it is known in advance that the output of a character recognizer should be invariant with respect to small spatial distortions of the input images (translations, rotations, scale changes, etc.). We have implemented a scheme that allows a network to learn the derivative of its outputs with respect to distortion operators of our choosing. This not only reduces the learning time and the amount of training data, but also provides a powerful language for specifying what generalizations we wish the network to perform.

1 INTRODUCTION

In machine learning, one very often knows more about the function to be learned than just the training data. An interesting case is when certain directional derivatives of the desired function are known at certain points.
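A minimal numerical sketch of the idea, assuming a generic `net` callable: the regularizer penalizes the directional derivative of the network's outputs along the tangent vector of a chosen distortion. The paper propagates this derivative analytically through the network; the central-difference approximation here is only for illustration.

```python
import numpy as np

def tangent_prop_penalty(net, x, tangent, eps=1e-3):
    """Finite-difference sketch of a Tangent Prop-style regularizer:
    the squared directional derivative of the network output along the
    tangent vector of a distortion (e.g. a small rotation).

    net:     callable mapping an input vector to an output vector.
    tangent: d(distorted x)/d(alpha) at alpha = 0, same shape as x.
    """
    # (f(x + eps*t) - f(x - eps*t)) / (2*eps) ~ directional derivative
    dfdalpha = (net(x + eps * tangent) - net(x - eps * tangent)) / (2 * eps)
    return float(np.sum(dfdalpha ** 2))  # add, suitably weighted, to the loss
```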


A Neural Net Model for Adaptive Control of Saccadic Accuracy by Primate Cerebellum and Brainstem

Neural Information Processing Systems

Accurate saccades require interaction between brainstem circuitry and the cerebellum. A model of this interaction is described, based on Kawato's principle of feedback-error-learning. In the model a part of the brainstem (the superior colliculus) acts as a simple feedback controller with no knowledge of initial eye position, and provides an error signal for the cerebellum to correct for eye-muscle nonlinearities. This teaches the cerebellum, modelled as a CMAC, to adjust appropriately the gain on the brainstem burst-generator's internal feedback loop and so alter the size of the burst sent to the motoneurons. With direction-only errors the system rapidly learns to make accurate horizontal eye movements from any starting position, and adapts realistically to subsequent simulated eye-muscle weakening or displacement of the saccadic target.
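The abstract does not detail the CMAC itself; a one-dimensional tile-coding sketch of the idea, with illustrative sizes and a feedback-error-style update, might look like this:

```python
import numpy as np

class CMAC:
    """Minimal CMAC (tile-coding) function approximator: several offset
    tilings each select one weight; the output is their sum."""

    def __init__(self, n_tilings=8, n_tiles=16, lo=-1.0, hi=1.0, lr=0.1):
        self.n_tilings, self.n_tiles, self.lr = n_tilings, n_tiles, lr
        self.lo, self.width = lo, (hi - lo) / n_tiles
        self.w = np.zeros((n_tilings, n_tiles + 1))

    def _tiles(self, x):
        # each tiling is shifted by a fraction of one tile width
        offs = np.arange(self.n_tilings) / self.n_tilings * self.width
        idx = ((x - self.lo + offs) / self.width).astype(int)
        return np.clip(idx, 0, self.n_tiles)

    def predict(self, x):
        return self.w[np.arange(self.n_tilings), self._tiles(x)].sum()

    def update(self, x, error):
        # feedback-error learning: the feedback controller's error signal
        # serves directly as the teaching signal
        self.w[np.arange(self.n_tilings), self._tiles(x)] += (
            self.lr * error / self.n_tilings
        )
```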


Dual Inhibitory Mechanisms for Definition of Receptive Field Characteristics in Cat Striate Cortex

Neural Information Processing Systems

In single cells of the cat striate cortex, lateral inhibition across orientation and/or spatial frequency is found to enhance preexisting biases. A contrast-dependent but spatially non-selective inhibitory component is also found. Stimulation with ascending and descending contrasts reveals the latter as a response hysteresis that is sensitive, powerful and rapid, suggesting that it is active in day-to-day vision. Both forms of inhibition are not recurrent but are rather network properties. These findings suggest two fundamental inhibitory mechanisms: a global mechanism that limits dynamic range and creates spatial selectivity through thresholding, and a local mechanism that specifically refines spatial filter properties.


A Topographic Product for the Optimization of Self-Organizing Feature Maps

Neural Information Processing Systems

We present a topographic product which measures the preservation of neighborhood relations as a criterion to optimize the output space topology of the map with regard to the global dimensionality D_A as well as to the dimensions in the individual directions. We test the topographic product method not only on synthetic mapping examples, but also on speech data.
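A sketch of one standard formulation of the topographic product (assuming distinct unit positions and weight vectors, so that no neighbor distances are zero): for each unit, the nearest-neighbor orders induced by the output lattice and by the input-space weights are compared, and P averages the logarithm of their mismatch. P near 0 indicates a good dimensionality match; P < 0 suggests the output space is too low-dimensional, P > 0 too high.

```python
import numpy as np
from scipy.spatial.distance import cdist

def topographic_product(grid_pos, weights):
    """Topographic product P for a feature map.

    grid_pos: (N, d_A) unit positions in the output lattice.
    weights:  (N, d_V) corresponding weight vectors in input space.
    """
    N = len(grid_pos)
    dA = cdist(grid_pos, grid_pos)        # output-space distances
    dV = cdist(weights, weights)          # input-space distances
    # k-th nearest neighbours of each unit in either space (column 0 = self)
    nnA = np.argsort(dA, axis=1)[:, 1:]
    nnV = np.argsort(dV, axis=1)[:, 1:]
    P = 0.0
    for j in range(N):
        q1 = dV[j, nnA[j]] / dV[j, nnV[j]]   # Q1(j, k)
        q2 = dA[j, nnA[j]] / dA[j, nnV[j]]   # Q2(j, k)
        log_p3 = np.cumsum(np.log(q1 * q2)) / (2 * np.arange(1, N))
        P += log_p3.sum()
    return P / (N * (N - 1))
```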


Markov Random Fields Can Bridge Levels of Abstraction

Neural Information Processing Systems

Network vision systems must make inferences from evidential information across levels of representational abstraction, from low-level invariants, through intermediate scene segments, to high-level behaviorally relevant object descriptions. This paper shows that such networks can be realized as Markov Random Fields (MRFs). We show first how to construct an MRF functionally equivalent to a Hough transform parameter network, thus establishing a principled probabilistic basis for visual networks. Second, we show that these MRF parameter networks are more capable and flexible than traditional methods. In particular, they have a well-defined probabilistic interpretation, intrinsically incorporate feedback, and offer richer representations and decision capabilities.
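The paper's construction is specific to Hough parameter networks; as a generic illustration of MRF-style inference with neighborhood feedback, a Potts-model ICM sweep over a label grid can serve. The penalty beta, the 4-connectivity, and the choice of ICM are illustrative, not the paper's method.

```python
import numpy as np

def icm_smooth(labels, unary_cost, beta=1.0, n_iter=5):
    """Iterated conditional modes on a 4-connected label grid: each pixel
    takes the label minimizing its data cost plus a Potts penalty beta
    for every disagreeing neighbour.

    labels:     (H, W) initial label map.
    unary_cost: (H, W, C) cost of assigning each class at each pixel.
    """
    H, W, C = unary_cost.shape
    lab = labels.copy()
    for _ in range(n_iter):
        for i in range(H):
            for j in range(W):
                best, best_cost = lab[i, j], np.inf
                for c in range(C):
                    cost = unary_cost[i, j, c]
                    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                        ni, nj = i + di, j + dj
                        if 0 <= ni < H and 0 <= nj < W and lab[ni, nj] != c:
                            cost += beta  # Potts clique potential
                    if cost < best_cost:
                        best, best_cost = c, cost
                lab[i, j] = best
    return lab
```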



The Clusteron: Toward a Simple Abstraction for a Complex Neuron

Neural Information Processing Systems

The nature of information processing in complex dendritic trees has remained an open question since the origin of the neuron doctrine 100 years ago. With respect to learning, for example, it is not known whether a neuron is best modeled as a pseudo-linear unit, equivalent in power to a simple Perceptron, or as a general nonlinear learning device, equivalent in power to a multi-layered network. In an attempt to characterize the input-output behavior of a whole dendritic tree containing voltage-dependent membrane mechanisms, a recent compartmental modeling study in an anatomically reconstructed neocortical pyramidal cell (anatomical data from Douglas et al., 1991; "NEURON" simulation package provided by Michael Hines and John Moore) showed that a dendritic tree rich in NMDA-type synaptic channels is selectively responsive to spatially clustered, as opposed to diffuse, patterns of synaptic activation (Mel, 1992). For example, 100 synapses which were simultaneously activated at 100 randomly chosen locations about the dendritic arbor were less effective at firing the cell than 100 synapses activated in groups of 5, at each of 20 randomly chosen dendritic locations. The cooperativity among the synapses in each group is due to the voltage dependence of the NMDA channel: each activated NMDA synapse becomes up to three times more effective at injecting synaptic current when the post-synaptic membrane is locally depolarized by 30-40 mV from the resting potential.
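A minimal sketch of a clusteron-style unit along a one-dimensional "dendrite", where each synapse's contribution is gated by the summed activity of its spatial neighbors, so clustered inputs are superadditive in the NMDA-like manner described above. The neighborhood radius and the explicit weights are illustrative choices.

```python
import numpy as np

def clusteron_response(x, w, radius=2):
    """Clusteron-style response: each synapse's drive is multiplied by the
    summed activity within its local neighbourhood on the dendrite, so
    clustered activation fires the unit more strongly than diffuse input.

    x: (N,) synaptic activations laid out along a 1-D dendrite.
    w: (N,) synaptic weights.
    """
    N = len(x)
    y = 0.0
    for i in range(N):
        lo, hi = max(0, i - radius), min(N, i + radius + 1)
        y += w[i] * x[i] * np.sum(x[lo:hi])  # neighbourhood gating term
    return y
```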