Information Technology
Probability Estimation from a Database Using a Gibbs Energy Model
Miller, John W., Goodman, Rodney M.
We present an algorithm for creating a neural network which produces accurate probability estimates as outputs. The network implements a Gibbs probability distribution model of the training database. This model is created by a new transformation relating the joint probabilities of attributes in the database to the weights (Gibbs potentials) of the distributed network model. The theory of this transformation is presented together with experimental results. One advantage of this approach is that the network weights are prescribed without iterative gradient descent. Used as a classifier, the network tied or outperformed published results on a variety of databases.
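For reference, a Gibbs model over an attribute vector x can be written in the following generic maximum-entropy form; the notation here is illustrative and not taken from the paper:

    P(\mathbf{x}) \;=\; \frac{1}{Z}\,\exp\!\Big(\sum_{c} \lambda_c\, \phi_c(\mathbf{x})\Big)

where each potential \lambda_c is attached to a feature \phi_c defined on a subset of attributes and Z normalizes the distribution. In this reading, the paper's transformation prescribes the potentials directly from joint attribute probabilities estimated in the database, rather than fitting them by gradient descent.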
Destabilization and Route to Chaos in Neural Networks with Random Connectivity
Doyon, Bernard, Cessac, Bruno, Quoy, Mathias, Samuelides, Manuel
The occurrence of chaos in recurrent neural networks is supposed to depend on the architecture and on the synaptic coupling strength. It is studied here for a randomly diluted architecture. By normalizing the variance of synaptic weights, we produce a bifurcation parameter, dependent on this variance and on the slope of the transfer function but independent of the connectivity, that allows sustained activity and the occurrence of chaos when it reaches a critical value. Even for weak connectivity and small size, we find numerical results in accordance with the theoretical ones previously established for fully connected infinite-sized networks. Moreover, the route towards chaos is numerically checked to be a quasi-periodic one, whatever the type of the first bifurcation (Hopf, pitchfork, or flip).
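As a rough illustration of the normalization (my notation and scaling, not the authors'): if each neuron receives K random connections with weights of variance \sigma^2 / K and the transfer function has slope g at the origin, then

    J_{ij} \sim \mathcal{N}\!\big(0,\, \sigma^2 / K\big), \qquad g = f'(0), \qquad \text{bifurcation parameter} \;\sim\; g\,\sigma,

so the parameter is independent of the connectivity K, and the quiescent state is expected to destabilize once g\sigma crosses a critical value of order one, matching the fully connected infinite-size theory. The exact constants and conventions are assumptions here, meant only to convey the idea.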
A Knowledge-Based Model of Geometry Learning
Towell, Geoffrey, Lehrer, Richard
We propose a model of the development of geometric reasoning in children that explicitly involves learning. The model uses a neural network that is initialized with an understanding of geometry similar to that of second-grade children. Through the presentation of a series of examples, the model is shown to develop an understanding of geometry similar to that of fifth-grade children who were trained using similar materials.
Object-Based Analog VLSI Vision Circuits
Koch, Christof, Mathur, Binnal, Liu, Shih-Chii, Harris, John G., Luo, Jin, Sivilotti, Massimo
We describe two successfully working analog VLSI vision circuits that move beyond pixel-based early vision algorithms. One circuit, implementing the dynamic wires model, provides for dedicated lines of communication among groups of pixels that share a common property. The chip uses the dynamic wires model to compute the arc length of visual contours. Another circuit labels all points inside a given contour with one voltage and all others with another voltage. Its behavior is very robust, since small breaks in contours are automatically sealed, providing for figure-ground segregation in a noisy environment. Both chips are implemented using networks of resistors and switches and represent a step towards object-level processing, since a single voltage value encodes the property of an ensemble of pixels.
On the Use of Evidence in Neural Networks
The Bayesian "evidence" approximation has recently been employed to determine the noise and weight-penalty terms used in back-propagation. This paper shows that for neural nets it is far easier to use the exact result than it is to use the evidence approximation. Moreover, unlike the evidence approximation,the exact result neither has to be re-calculated for every new data set, nor requires the running of computer code (the exact result is closed form). In addition, it turns out that the evidence procedure's MAPestimate for neural nets is, in toto, approximation error. Another advantage of the exact analysis is that it does not lead one to incorrect intuition, like the claim that using evidence one can "evaluate different priors in light of the data". This paper also discusses sufficiency conditions for the evidence approximation to hold, why it can sometimes give "reasonable" results, etc.
Single-Iteration Threshold Hamming Networks
Meilijson, Isaac, Ruppin, Eytan, Sipper, Moshe
We analyze in detail the performance of a Hamming network classifying inputs that are distorted versions of one of its m stored memory patterns. The activation function of the memory neurons in the original Hamming network is replaced by a simple threshold function. The resulting threshold Hamming network (THN) drastically reduces the time and space complexity of Hamming network classifiers. Originally presented in (Steinbuch 1961, Taylor 1964), the Hamming network (HN) has received renewed attention in recent years (Lippmann et al.). The HN calculates the Hamming distance between the input pattern and each memory pattern, and selects the memory with the smallest distance. It is composed of two subnets: the similarity subnet, consisting of an n-neuron input layer connected with an m-neuron memory layer, calculates the number of equal bits between the input and each memory pattern.
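A minimal sketch, in Python, of the selection step such a classifier performs; the threshold variant below is only an assumed illustration of replacing the winner-take-all stage with a fixed threshold, not the paper's construction or its analysis:

    import numpy as np

    def hamming_classify(x, memories):
        """Return the index of the stored memory closest to x in Hamming distance.
        x: binary (0/1) input vector of length n; memories: m x n binary array."""
        distances = np.sum(memories != x, axis=1)   # Hamming distance to each memory
        return int(np.argmin(distances))            # pick the nearest memory

    def threshold_hamming_classify(x, memories, theta):
        """Accept any memory whose number of matching bits reaches the threshold
        theta, in a single pass; return None if no memory qualifies."""
        similarities = np.sum(memories == x, axis=1)  # number of equal bits
        hits = np.nonzero(similarities >= theta)[0]
        return int(hits[0]) if hits.size else None

The appeal of a threshold rule is that no iterative winner-take-all competition among the m memory neurons is needed, which is the sense in which time and space complexity can be reduced.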
Unsupervised Discrimination of Clustered Data via Optimization of Binary Information Gain
Schraudolph, Nicol N., Sejnowski, Terrence J.
We present the information-theoretic derivation of a learning algorithm that clusters unlabelled data with linear discriminants. In contrast to methods that try to preserve information about the input patterns, we maximize the information gained from observing the output of robust binary discriminators implemented with sigmoid nodes. We derive a local weight adaptation rule via gradient ascent in this objective, demonstrate its dynamics on some simple data sets, relate our approach to previous work, and suggest directions in which it may be extended.
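Schematically (in my notation, not necessarily the paper's), the quantity being maximized for a binary output y with prior estimate \hat{y} is the drop in binary entropy caused by an observation,

    \Delta H \;=\; H(\hat{y}) - H(y), \qquad H(p) = -p \log p - (1 - p) \log (1 - p),

and the weight update for a sigmoid discriminator ascends the gradient of this gain; how the prior estimate \hat{y} is formed is specified in the paper and not reproduced here.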
A Note on Learning Vector Quantization
Sa, Virginia R. de, Ballard, Dana H.
Vector Quantization is useful for data compression. Competitive Learning, which minimizes reconstruction error, is an appropriate algorithm for vector quantization of unlabelled data. Vector quantization of labelled data for classification has a different objective, to minimize the number of misclassifications, and a different algorithm is appropriate. We show that a variant of Kohonen's LVQ2.1 algorithm can be seen as a multiclass extension of an algorithm which, in a restricted two-class case, can be proven to converge to the Bayes-optimal classification boundary. We compare the performance of the LVQ2.1 algorithm to that of a modified version having a decreasing window and normalized step size, on a ten-class vowel classification problem.
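For concreteness, a minimal sketch of the standard LVQ2.1 update step (window test plus opposite-sign prototype updates); the decreasing window and normalized step size examined in the paper are not included in this sketch:

    import numpy as np

    def lvq21_step(x, label, protos, proto_labels, lr=0.05, window=0.3):
        """One LVQ2.1 update: adjust the two nearest prototypes when they have
        different classes, one matches the sample's label, and x lies inside the
        window around the decision boundary between them."""
        d = np.linalg.norm(protos - x, axis=1)      # distances to all prototypes
        i, j = np.argsort(d)[:2]                    # two nearest prototypes
        di, dj = d[i], d[j]
        if min(di / dj, dj / di) <= (1 - window) / (1 + window):
            return                                  # x falls outside the window
        li, lj = proto_labels[i], proto_labels[j]
        if li == lj:
            return                                  # update only across a class boundary
        if li == label:
            correct, wrong = i, j
        elif lj == label:
            correct, wrong = j, i
        else:
            return                                  # neither prototype matches the label
        protos[correct] += lr * (x - protos[correct])  # pull correct prototype toward x
        protos[wrong]   -= lr * (x - protos[wrong])    # push wrong prototype away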
Directional-Unit Boltzmann Machines
Zemel, Richard S., Williams, Christopher K. I., Mozer, Michael C.
We present a general formulation for a network of stochastic directional units. This formulation is an extension of the Boltzmann machine in which the units are not binary, but take on values in a cyclic range, between 0 and 2π radians. The conditional distribution of a unit's stochastic state is a circular version of the Gaussian probability distribution, known as the von Mises distribution, characterized by a mean direction and a degree of certainty (concentration). This combination of a value and a certainty provides additional representational power in a unit. Many kinds of information can naturally be represented in terms of angular, or directional, variables. A circular range forms a suitable representation for explicitly directional information, such as wind direction, as well as for information where the underlying range is periodic, such as days of the week or months of the year.
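For reference, the von Mises density on the circle, with mean direction μ and concentration κ (I_0 being the modified Bessel function of order zero), is

    p(\theta \mid \mu, \kappa) \;=\; \frac{\exp\!\big(\kappa \cos(\theta - \mu)\big)}{2\pi I_0(\kappa)}, \qquad \theta \in [0, 2\pi).

As κ approaches 0 the distribution becomes uniform on the circle, and as κ grows it concentrates around μ, which is what lets a single unit convey both a direction and a degree of certainty about it.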