Non-Linear Dimensionality Reduction

Neural Information Processing Systems

A method for creating a nonlinear encoder-decoder for multidimensional data with compact representations is presented. The commonly used technique of autoassociation is extended to allow nonlinear representations, and an objective function which penalizes activations of individual hidden units is shown to result in minimum dimensional encodings with respect to allowable error in reconstruction.

1 INTRODUCTION

Reducing dimensionality of data with minimal information loss is important for feature extraction, compact coding and computational efficiency. The data can be transformed into "good" representations for further processing, constraints among feature variables may be identified, and redundancy eliminated. Many algorithms are exponential in the dimensionality of the input, thus even reduction by a single dimension may provide valuable computational savings. Autoassociating feedforward networks with one hidden layer have been shown to extract the principal components of the data (Baldi & Hornik, 1988). Such networks have been used to extract features and develop compact encodings of the data (Cottrell, Munro & Zipser, 1989). Principal Components Analysis projects the data into a linear subspace ... (email: demers@cs.ucsd.edu)


History-Dependent Attractor Neural Networks

Neural Information Processing Systems

We present a methodological framework enabling a detailed description of the performance of Hopfield-like attractor neural networks (ANN) in the first two iterations. Using the Bayesian approach, we find that performance is improved when a history-based term is included in the neuron's dynamics. A further enhancement of the network's performance is achieved by judiciously choosing the censored neurons (those which become active in a given iteration) on the basis of the magnitude of their post-synaptic potentials. The contribution of biologically plausible, censored, history-dependent dynamics is especially marked in conditions of low firing activity and sparse connectivity, two important characteristics of the mammalian cortex. In such networks, the performance attained is higher than the performance of two 'independent' iterations, which represents an upper bound on the performance of history-independent networks.


Single-Iteration Threshold Hamming Networks

Neural Information Processing Systems

Isaac Meilijson, Eytan Ruppin, Moshe Sipper. School of Mathematical Sciences, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, 69978 Tel Aviv, Israel.

Abstract: We analyze in detail the performance of a Hamming network classifying inputs that are distorted versions of one of its m stored memory patterns. The activation function of the memory neurons in the original Hamming network is replaced by a simple threshold function. The resulting threshold Hamming network (THN) drastically reduces the time and space complexity of Hamming network classifiers.

1 Introduction: Originally presented in (Steinbuch 1961, Taylor 1964), the Hamming network (HN) has received renewed attention in recent years (Lippmann et. ... The HN calculates the Hamming distance between the input pattern and each memory pattern, and selects the memory with the smallest distance. It is composed of two subnets: the similarity subnet, consisting of an n-neuron input layer connected with an m-neuron memory layer, calculates the number of equal bits between the input and each memory pattern.
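The classification rule described in this entry can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the example patterns and the threshold value are all assumptions made for illustration, and the listing does not say how the threshold of the THN is chosen.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length bit sequences differ."""
    if len(a) != len(b):
        raise ValueError("patterns must have the same length")
    return sum(x != y for x, y in zip(a, b))


def similarity(a, b):
    """Number of equal bits, i.e. n minus the Hamming distance."""
    return len(a) - hamming_distance(a, b)


def classify(input_pattern, memories):
    """Index of the stored memory with the smallest Hamming distance."""
    scores = [similarity(input_pattern, m) for m in memories]
    return max(range(len(memories)), key=scores.__getitem__)


def threshold_layer(input_pattern, memories, threshold):
    """Indices of memories whose similarity reaches a fixed threshold.

    The threshold is a hypothetical parameter of this sketch.
    """
    return [i for i, m in enumerate(memories)
            if similarity(input_pattern, m) >= threshold]


# Hypothetical example: m = 3 stored patterns of n = 8 bits each.
memories = [
    (0, 0, 0, 0, 0, 0, 0, 0),
    (1, 1, 1, 1, 0, 0, 0, 0),
    (1, 0, 1, 0, 1, 0, 1, 0),
]
probe = (1, 1, 1, 0, 0, 0, 0, 0)  # memory 1 with one bit flipped
winner = classify(probe, memories)
above_threshold = threshold_layer(probe, memories, threshold=6)
```

Here `classify` plays the role of the similarity subnet followed by a selection of the best match, while `threshold_layer` shows how a simple threshold on the same similarity scores can single out a memory in one pass when only one score exceeds it.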


Predicting Complex Behavior in Sparse Asymmetric Networks

Neural Information Processing Systems

Recurrent networks of threshold elements have been studied intensively as associative memories and pattern-recognition devices. While most research has concentrated on fully-connected symmetric networks.


Destabilization and Route to Chaos in Neural Networks with Random Connectivity

Neural Information Processing Systems

The occurrence of chaos in recurrent neural networks is supposed to depend on the architecture and on the synaptic coupling strength. It is studied here for a randomly diluted architecture. By normalizing the variance of synaptic weights, we produce a bifurcation parameter, dependent on this variance and on the slope of the transfer function but independent of the connectivity, that allows a sustained activity and the occurrence of chaos when reaching a critical value. Even for weak connectivity and small size, we find numerical results in accordance with the theoretical ones previously established for fully connected infinite-sized networks. Moreover, the route towards chaos is numerically checked to be a quasi-periodic one, whatever the type of the first bifurcation (Hopf bifurcation, pitchfork or flip).
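One way to read the variance normalization mentioned in this entry is sketched below in Python. It rests on assumptions that the listing does not confirm: each neuron receives exactly k nonzero incoming weights, each drawn from a Gaussian with variance sigma^2 / k, so that the total variance of a neuron's input weights stays near sigma^2 whatever the connectivity k. The names `diluted_weights` and `mean_row_power` are hypothetical.

```python
import math
import random


def diluted_weights(n, k, sigma, rng):
    """n x n weight matrix with k nonzero entries per row.

    Assumption of this sketch: each nonzero weight is drawn from
    N(0, sigma^2 / k); self-connections are not excluded.
    """
    scale = sigma / math.sqrt(k)
    weights = []
    for _ in range(n):
        row = [0.0] * n
        for j in rng.sample(range(n), k):
            row[j] = rng.gauss(0.0, scale)
        weights.append(row)
    return weights


def mean_row_power(weights):
    """Average over neurons of the summed squared incoming weights."""
    return sum(sum(w * w for w in row) for row in weights) / len(weights)


rng = random.Random(0)
sparse = diluted_weights(400, 5, 1.5, rng)   # weak connectivity
dense = diluted_weights(400, 50, 1.5, rng)   # ten times more connections
sparse_power = mean_row_power(sparse)
dense_power = mean_row_power(dense)
```

Under these assumptions both `sparse_power` and `dense_power` should lie close to sigma^2 = 2.25, which is the sense in which a parameter built from the normalized variance does not depend on the connectivity.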


Probability Estimation from a Database Using a Gibbs Energy Model

Neural Information Processing Systems

We present an algorithm for creating a neural network which produces accurate probability estimates as outputs. The network implements a Gibbs probability distribution model of the training database. This model is created by a new transformation relating the joint probabilities of attributes in the database to the weights (Gibbs potentials) of the distributed network model. The theory of this transformation is presented together with experimental results. One advantage of this approach is that the network weights are prescribed without iterative gradient descent. Used as a classifier, the network tied or outperformed published results on a variety of databases.


Unsupervised Discrimination of Clustered Data via Optimization of Binary Information Gain

Neural Information Processing Systems

We present the information-theoretic derivation of a learning algorithm that clusters unlabelled data with linear discriminants. In contrast to methods that try to preserve information about the input patterns, we maximize the information gained from observing the output of robust binary discriminators implemented with sigmoid nodes. We derive a local weight adaptation rule via gradient ascent in this objective, demonstrate its dynamics on some simple data sets, relate our approach to previous work and suggest directions in which it may be extended.


Self-Organizing Rules for Robust Principal Component Analysis

Neural Information Processing Systems

Using statistical physics techniques including the Gibbs distribution, binary decision fields and effective energies, we propose self-organizing PCA rules which are capable of resisting outliers while fulfilling various PCA-related tasks such as obtaining the first principal component vector, the first k principal component vectors, and directly finding the subspace spanned by the first k principal component vectors without solving for each vector individually. Comparative experiments have shown that the proposed robust rules improve the performance of the existing PCA algorithms significantly when outliers are present.


Diffusion Approximations for the Constant Learning Rate Backpropagation Algorithm and Resistence to Local Minima

Neural Information Processing Systems

... ∈ (0, ∞), remains, in spite of many real (and imagined) deficiencies, the most widely used network training algorithm, and a vast body of literature documents its general applicability and robustness. In this paper we will draw on the highly developed literature of stochastic approximation theory to demonstrate several asymptotic properties of simple backpropagation.