Holographic Recurrent Networks

Neural Information Processing Systems

Holographic Recurrent Networks (HRNs) are recurrent networks which incorporate associative memory techniques for storing sequential structure. HRNs can be easily and quickly trained using gradient descent techniques to generate sequences of discrete outputs and trajectories through continuous space. The performance of HRNs is found to be superior to that of ordinary recurrent networks on these sequence generation tasks.
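The associative memory technique underlying HRNs is circular convolution, as in Plate's holographic reduced representations. Below is a minimal sketch of binding and decoding with that operation via FFTs; the vector dimension and random codes are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def cconv(a, b):
    # Circular convolution: the binding operation of holographic memory.
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def ccorr(a, b):
    # Circular correlation: approximate inverse of cconv, used to decode.
    return np.real(np.fft.ifft(np.conj(np.fft.fft(a)) * np.fft.fft(b)))

rng = np.random.default_rng(0)
n = 1024                                  # illustrative dimensionality
key = rng.normal(0.0, 1.0 / np.sqrt(n), n)
item = rng.normal(0.0, 1.0 / np.sqrt(n), n)
trace = cconv(key, item)                  # store the pair in one trace
decoded = ccorr(key, trace)               # noisy reconstruction of item
print(np.dot(decoded, item))              # close to 1: item recovered
```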


The Computation of Stereo Disparity for Transparent and for Opaque Surfaces

Neural Information Processing Systems

The classical computational model for stereo vision incorporates a uniqueness inhibition constraint to enforce a one-to-one feature match, thereby sacrificing the ability to handle transparency. Critics of the model disregard the uniqueness constraint and argue that the smoothness constraint can provide the excitation support required for transparency computation. However, this modification fails in neighborhoods with sparse features. We propose a Bayesian approach to stereo vision with priors favoring cohesive over transparent surfaces. The disparity and its segmentation into a multi-layer "depth planes" representation are simultaneously computed. The smoothness constraint propagates support within each layer, providing mutual excitation for non-neighboring transparent or partially occluded regions. Test results for various random-dot and other stereograms are presented.
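The layered representation is what makes the smoothness constraint usable here: support is propagated among neighbors within the same depth plane rather than across all matches. The toy relaxation below illustrates only that gating; it is our own sketch, not the paper's Bayesian formulation.

```python
import numpy as np

def layered_support(score, labels, n_layers, lam=0.5, iters=5):
    """Propagate smoothness support within each depth layer.

    score:  (H, W) evidence for a disparity hypothesis at each pixel
    labels: (H, W) integer assignment of pixels to depth layers
    Support is averaged over same-layer 4-neighbors, so disjoint regions
    of one transparent surface reinforce each other without being
    inhibited by matches belonging to the other layer.
    """
    s = score.astype(float).copy()
    for _ in range(iters):
        new_s = s.copy()
        for layer in range(n_layers):
            mask = (labels == layer).astype(float)
            sl = s * mask
            # np.roll wraps at the borders; acceptable for this toy example.
            nb = (np.roll(sl, 1, 0) + np.roll(sl, -1, 0)
                  + np.roll(sl, 1, 1) + np.roll(sl, -1, 1))
            cnt = (np.roll(mask, 1, 0) + np.roll(mask, -1, 0)
                   + np.roll(mask, 1, 1) + np.roll(mask, -1, 1))
            avg = nb / np.maximum(cnt, 1.0)
            new_s = np.where(mask > 0, s + lam * avg, new_s)
        s = new_s
    return s
```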


Second order derivatives for network pruning: Optimal Brain Surgeon

Neural Information Processing Systems

We investigate the use of information from all second order derivatives of the error function to perform network pruning (i.e., removing unimportant weights from a trained network) in order to improve generalization, simplify networks, reduce hardware or storage requirements, increase the speed of further training, and in some cases enable rule extraction. Our method, Optimal Brain Surgeon (OBS), is significantly better than magnitude-based methods and Optimal Brain Damage [Le Cun, Denker and Solla, 1990], which often remove the wrong weights. OBS permits the pruning of more weights than other methods (for the same error on the training set), and thus yields better generalization on test data. Crucial to OBS is a recursion relation for calculating the inverse Hessian matrix H^{-1} from training data and structural information of the net. OBS permits a 90%, a 76%, and a 62% reduction in weights over backpropagation with weight decay on three benchmark MONK's problems [Thrun et al., 1991]. Of OBS, Optimal Brain Damage, and magnitude-based methods, only OBS deletes the correct weights from a trained XOR network in every case. Finally, whereas Sejnowski and Rosenberg [1987] used 18,000 weights in their NETtalk network, we used OBS to prune a network to just 1560 weights, yielding better generalization.
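The pruning step behind these numbers follows from a second-order Taylor expansion of the error around the trained weights: the saliency of weight q is L_q = w_q^2 / (2 [H^{-1}]_{qq}), and deleting it comes with an optimal adjustment of all remaining weights. The sketch below assumes the inverse Hessian has already been computed (e.g., by the paper's recursion); the function name is ours.

```python
import numpy as np

def obs_prune_step(w, H_inv):
    """One Optimal Brain Surgeon step: delete the lowest-saliency weight
    and adjust the remaining weights to compensate.

    w:     (n,) trained weight vector
    H_inv: (n, n) inverse Hessian of the error w.r.t. the weights
    """
    # Saliency of deleting weight q: L_q = w_q^2 / (2 [H^-1]_qq)
    saliency = w ** 2 / (2.0 * np.diag(H_inv))
    q = int(np.argmin(saliency))
    # Optimal update of the full weight vector when w_q is forced to zero:
    # delta_w = -(w_q / [H^-1]_qq) * H^-1 e_q
    delta = -(w[q] / H_inv[q, q]) * H_inv[:, q]
    return w + delta, q, saliency[q]     # (w + delta)[q] is exactly 0
```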


Probability Estimation from a Database Using a Gibbs Energy Model

Neural Information Processing Systems

We present an algorithm for creating a neural network which produces accurate probability estimates as outputs. The network implements a Gibbs probability distribution model of the training database. This model is created by a new transformation relating the joint probabilities of attributes in the database to the weights (Gibbs potentials) of the distributed network model. The theory of this transformation is presented together with experimental results. One advantage of this approach is that the network weights are prescribed without iterative gradient descent. Used as a classifier, the network tied or outperformed published results on a variety of databases.
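The abstract does not spell out the transformation itself. As one illustrative reading only, the standard log-linear construction below sets pairwise Gibbs potentials from empirical joint and marginal frequencies; it shows what "weights prescribed from database probabilities" can look like, not the paper's exact mapping.

```python
import numpy as np

def pairwise_potentials(X, eps=1e-6):
    """Prescribe pairwise Gibbs potentials from counts of binary attributes.

    X: (N, d) 0/1 data matrix.  Returns (d, d) weights with
    W[i, j] = log P(x_i=1, x_j=1) - log(P(x_i=1) * P(x_j=1)),
    positive when attributes co-occur more often than independence
    predicts.  A standard log-linear construction, used here purely
    as an illustration; no gradient descent is involved.
    """
    N = X.shape[0]
    p = (X.sum(axis=0) + eps) / (N + eps)      # marginals P(x_i = 1)
    pij = (X.T @ X + eps) / (N + eps)          # joints  P(x_i=1, x_j=1)
    W = np.log(pij) - np.log(np.outer(p, p))
    np.fill_diagonal(W, 0.0)
    return W
```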


Optimal Depth Neural Networks for Multiplication and Related Problems

Neural Information Processing Systems

An artificial neural network (ANN) is commonly modeled by a threshold circuit, a network of interconnected processing units called linear threshold gates. The depth of a network represents the number of unit delays or the time for parallel computation. The size of a circuit is the number of gates and measures the amount of hardware. It was known that traditional logic circuits consisting of only unbounded fan-in AND, OR, NOT gates would require at least Ω(log n/log log n) depth to compute common arithmetic functions such as the product or the quotient of two n-bit numbers, unless we allow the size (and fan-in) to increase exponentially (in n). We show in this paper that ANNs can be much more powerful than traditional logic circuits. In particular, we prove that iterated addition can be computed by depth-2 ANNs, and that multiplication and division can be computed by depth-3 ANNs with polynomial size and polynomially bounded integer weights. Moreover, it follows from known lower bound results that these ANNs are optimal in depth. We also indicate that these techniques can be applied to construct polynomial-size depth-3 ANNs for powering, and depth-4 ANNs for multiple product.
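The circuit model is easy to make concrete. A linear threshold gate outputs 1 exactly when a weighted sum of its inputs reaches a threshold, and the classic depth-2 construction below computes PARITY, a function that constant-depth, polynomial-size AND/OR/NOT circuits cannot compute. It illustrates the model's power at small depth, not the paper's multiplication circuit.

```python
def threshold_gate(weights, theta, inputs):
    """Linear threshold gate: outputs 1 iff sum_i w_i * x_i >= theta."""
    return int(sum(w * x for w, x in zip(weights, inputs)) >= theta)

def parity_depth2(x):
    """Depth-2 threshold circuit for the parity of n bits.

    Layer 1: gates T_k = [x_1 + ... + x_n >= k] for k = 1..n.
    Layer 2: weights alternate +1/-1, so if exactly s inputs are 1 the
    weighted sum is 1 for odd s and 0 for even s; threshold 1 decides.
    """
    n = len(x)
    layer1 = [threshold_gate([1] * n, k, x) for k in range(1, n + 1)]
    weights = [1 if k % 2 == 1 else -1 for k in range(1, n + 1)]
    return threshold_gate(weights, 1, layer1)

# Exhaustive check on all 4-bit inputs:
assert all(parity_depth2([(i >> b) & 1 for b in range(4)])
           == bin(i).count("1") % 2 for i in range(16))
```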


Combining Neural and Symbolic Learning to Revise Probabilistic Rule Bases

Neural Information Processing Systems

Recently, both connectionist and symbolic methods have been developed for biasing learning with prior knowledge [Fu, 1989; Towell et al., 1990; Ourston and Mooney, 1990]. Most of these methods revise an imperfect knowledge base (usually obtained from a domain expert) to fit a set of empirical data. Some of these methods have been successfully applied to real-world tasks, such as recognizing promoter sequences in DNA [Towell et al., 1990; Ourston and Mooney, 1990]. The results demonstrate that revising an expert-given knowledge base produces more accurate results than learning from training data alone.


Metamorphosis Networks: An Alternative to Constructive Models

Neural Information Processing Systems

Given a set of training examples, determining the appropriate number of free parameters is a challenging problem. Constructive learning algorithms attempt to solve this problem automatically by adding hidden units, and therefore free parameters, during learning. We explore an alternative class of algorithms, called metamorphosis algorithms, in which the number of units is fixed but the number of free parameters gradually increases during learning. The architecture we investigate is composed of RBF units on a lattice, which imposes flexible constraints on the parameters of the network. Virtues of this approach include variable subset selection, robust parameter selection, multiresolution processing, and interpolation of sparse training data.
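The abstract leaves the mechanics open; one natural reading of "fixed units, growing free parameters" is progressive untying of parameters that start out shared across the lattice. The toy class below sketches that schedule under our own assumptions (the blockwise tying and its halving schedule are illustrative, not the paper's).

```python
import numpy as np

class MetamorphosisRBF:
    """RBF units on a fixed 1-D lattice.  All widths start tied to a
    single shared value; untie() halves the tying block, so the number
    of free parameters grows while the number of units stays constant."""

    def __init__(self, n_units=16):
        self.n_units = n_units
        self.centers = np.linspace(0.0, 1.0, n_units)   # fixed lattice
        self.block = n_units          # units per shared width parameter
        self.widths = np.full(n_units, 0.1)

    def n_free_params(self):
        return self.n_units // self.block

    def untie(self):
        # Each tying group splits in two: free parameters double.
        self.block = max(1, self.block // 2)

    def set_widths(self, per_block):
        # One free value per tying block, broadcast to its units.
        self.widths = np.repeat(np.asarray(per_block), self.block)

    def features(self, x):
        # Gaussian RBF activations for a batch of scalar inputs.
        return np.exp(-(x[:, None] - self.centers) ** 2
                      / (2.0 * self.widths ** 2))
```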


Topography and Ocular Dominance with Positive Correlations

Neural Information Processing Systems

A single developmental model of both topography and ocular dominance is presented, motivated by experimental evidence that these phenomena may be subserved by the same mechanisms. An important aspect of this model is that ocular dominance segregation can occur when input activity is both distributed and positively correlated between the eyes. This allows investigation of the dependence of the pattern of ocular dominance stripes on the degree of correlation between the eyes: it is found that increasing correlation leads to narrower stripes. Experiments are suggested to test whether such behaviour occurs in the natural system.
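The two quantities the abstract turns on can be made concrete: two-eye activity that is distributed and positively correlated with a controllable coefficient, and a per-unit ocular dominance measure. The sketch below is a generic construction of those quantities, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(0)

def eye_activity(n, c, samples=10000):
    """Distributed left/right-eye activity with inter-eye correlation c
    (0 <= c <= 1), the parameter varied in the model."""
    shared = rng.normal(size=(samples, n))
    left = np.sqrt(c) * shared + np.sqrt(1 - c) * rng.normal(size=(samples, n))
    right = np.sqrt(c) * shared + np.sqrt(1 - c) * rng.normal(size=(samples, n))
    return left, right

left, right = eye_activity(n=16, c=0.4)
print(np.corrcoef(left.ravel(), right.ravel())[0, 1])   # approximately 0.4

def od_index(w_left, w_right):
    """Ocular dominance of a cortical unit, from -1 (left) to +1 (right);
    stripes appear when neighbouring units' indices alternate in sign."""
    return (w_right.sum() - w_left.sum()) / (w_right.sum() + w_left.sum())
```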


Hybrid Circuits of Interacting Computer Model and Biological Neurons

Neural Information Processing Systems

We demonstrate the use of a digital signal processing board to construct hybrid networks consisting of computer model neurons connected to a biological neural network. This system operates in real time.
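No implementation details are given in this short abstract. Purely as a shape-of-the-loop illustration, the sketch below runs a model neuron against stubbed-out device calls: read_adc and write_dac are hypothetical stand-ins for the DSP board's actual I/O, and the leaky integrate-and-fire update is a stand-in for the model neurons used in the original work.

```python
DT = 1e-4          # 0.1 ms step: the loop must keep pace with the biology

def read_adc():
    """Hypothetical stand-in for the A/D read of the biological
    neuron's membrane potential (mV)."""
    return -50.0

def write_dac(current):
    """Hypothetical stand-in for the D/A output injecting current
    back into the biological cell."""
    pass

def synaptic_current(v_pre, g=0.5, v_thresh=-55.0):
    # Model synapse: the recorded cell drives the model neuron only
    # when it is depolarized past threshold.
    return g if v_pre > v_thresh else 0.0

v = -65.0                                 # model membrane potential (mV)
for step in range(10000):                 # about 1 s of interaction
    v_bio = read_adc()                    # sample the biological neuron
    i_syn = synaptic_current(v_bio)
    v += DT / 0.02 * (-65.0 - v) + i_syn  # leaky integrate-and-fire step
    if v >= -50.0:
        v = -70.0                         # spike and reset
    write_dac(0.1 * (v + 65.0))           # map model state to injected current
```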