AITopics

We have trained networks of E - II units with short-range connections to simulate simple cellular automata that exhibit complex or chaotic behaviour. Three levels of learning are possible (in decreasing order of difficulty): learning the underlying automaton rule, learning asymptotic dynamical behaviour, and learning to extrapolate the training history. The levels of learning achieved with and without weight sharing for different automata provide new insight into their dynamics.

history, training history, weight sharing, (14 more...)

Country:

North America > United States > New Mexico > Los Alamos County > Los Alamos (0.05)
Europe > Denmark > Capital Region > Copenhagen (0.05)
North America > United States > Maryland > Montgomery County > Bethesda (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Helmke, Uwe, Williamson, Robert C.

Rational Parametrizations of Neural Networks

IR is typically a sigmoidal function such as (1.2) but other choices than (1.2) are possible and of interest.

rational function, rational parametrization, representation, (13 more...)

Country:

North America > United States > New York (0.05)
Europe > Germany > Bavaria > Regensburg (0.05)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)
Asia > Middle East > Jordan (0.04)

Industry: Aerospace & Defense (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.87)

DasGupta, Bhaskar, Schnitger, Georg

The Power of Approximating: a Comparison of Activation Functions

We compare activation functions in terms of the approximation power of their feedforward nets. We consider the case of analog as well as boolean input. 1 Introduction

activation function, polynomial, threshold, (13 more...)

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Pennsylvania > Centre County > University Park (0.04)
North America > United States > Michigan (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Murata, Noboru, Yoshizawa, Shuji, Amari, Shun-ichi

Learning Curves, Model Selection and Complexity of Neural Networks

Learning curves show how a neural network is improved as the number of t.raiuing examples increases and how it is related to the network complexity. The present paper clarifies asymptotic properties and their relation of t.wo learning curves, one concerning the predictive loss or generalization loss and the other the training loss. The result gives a natural definition of the complexity of a neural network. Moreover, it provides a new criterion of model selection.

model selection and complexity, predictive loss, ylx, (12 more...)

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.17)
North America > United States > California > San Mateo County > San Mateo (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural Network Model Selection Using Asymptotic Jackknife Estimator and Cross-Validation Method

Liu, Yong

Two theorems and a lemma are presented about the use of jackknife estimator and the cross-validation method for model selection. Theorem 1 gives the asymptotic form for the jackknife estimator. Combined with the model selection criterion, this asymptotic form can be used to obtain the fit of a model. The model selection criterion we used is the negative of the average predictive likehood, the choice of which is based on the idea of the cross-validation method. Lemma 1 provides a formula for further exploration of the asymptotics of the model selection criterion. Theorem 2 gives an asymptotic form of the model selection criterion for the regression case, when the parameters optimization criterion has a penalty term. Theorem 2 also proves the asymptotic equivalence of Moody's model selection criterion (Moody, 1992) and the cross-validation method, when the distance measure between response y and regression function takes the form of a squared difference. 1 INTRODUCTION Selecting a model for a specified problem is the key to generalization based on the training data set.

criterion, estimator, model selection criterion, (10 more...)

Country:

North America > United States > Rhode Island > Providence County > Providence (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > California (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Cross Validation (1.00)

Golea, Mostefa, Marchand, Mario, Hancock, Thomas R.

On Learning µ-Perceptron Networks with Binary Weights

Abstract Missing

binary weight

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.40)

DeMers, David, Cottrell, Garrison W.

Non-Linear Dimensionality Reduction

A method for creating a nonlinear encoder-decoder for multidimensional data with compact representations is presented. The commonly used technique of autoassociation is extended to allow nonlinear representations, and an objective function which penalizes activations of individual hidden units is shown to result in minimum dimensional encodings with respect to allowable error in reconstruction. 1 INTRODUCTION Reducing dimensionality of data with minimal information loss is important for feature extraction, compact coding and computational efficiency. The data can be tranformed into "good" representations for further processing, constraints among feature variables may be identified, and redundancy eliminated. Many algorithms are exponential in the dimensionality of the input, thus even reduction by a single dimension may provide valuable computational savings. Autoassociating feed forward networks with one hidden layer have been shown to extract the principal components of the data (Baldi & Hornik, 1988). Such networks have been used to extract features and develop compact encodings of the data (Cottrell, Munro & Zipser, 1989). Principal Components Analysis projects the data into a linear subspace -email: demers@cs.ucsd.edu

dimensionality, representation, representation layer, (15 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > San Diego County > La Jolla (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.42)

Meilijson, Isaac, Ruppin, Eytan

History-Dependent Attractor Neural Networks

We present a methodological framework enabling a detailed description of the performance of Hopfield-like attractor neural networks (ANN) in the first two iterations. Using the Bayesian approach, we find that performance is improved when a history-based term is included in the neuron's dynamics. A further enhancement of the network's performance is achieved by judiciously choosing the censored neurons (those which become active in a given iteration) on the basis of the magnitude of their post-synaptic potentials. The contribution of biologically plausible, censored, historydependent dynamics is especially marked in conditions of low firing activity and sparse connectivity, two important characteristics of the mammalian cortex. In such networks, the performance attained is higher than the performance of two'independent' iterations, which represents an upper bound on the performance of history-independent networks.

connectivity, iteration, neuron, (15 more...)

Country:

North America > United States (0.04)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)

Industry: Law > Civil Rights & Constitutional Law (0.61)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Doyon, Bernard, Cessac, Bruno, Quoy, Mathias, Samuelides, Manuel

Destabilization and Route to Chaos in Neural Networks with Random Connectivity

The occurence of chaos in recurrent neural networks is supposed to depend on the architecture and on the synaptic coupling strength. It is studied here for a randomly diluted architecture. By normalizing the variance of synaptic weights, we produce a bifurcation parameter, dependent on this variance and on the slope of the transfer function but independent of the connectivity, that allows a sustained activity and the occurence of chaos when reaching a critical value. Even for weak connectivity and small size, we find numerical results in accordance with the theoretical ones previously established for fully connected infinite sized networks. Moreover the route towards chaos is numerically checked to be a quasi-periodic one, whatever the type of the first bifurcation is (Hopf bifurcation, pitchfork or flip).

bifurcation, chaos, neural network, (14 more...)

Country:

Europe > France > Occitanie > Haute-Garonne > Toulouse (0.07)
Europe > Russia (0.04)
Asia > Russia (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)

On the Use of Evidence in Neural Networks

Wolpert, David

The Bayesian "evidence" approximation has recently been employed to determine the noise and weight-penalty terms used in back-propagation. This paper shows that for neural nets it is far easier to use the exact result than it is to use the evidence approximation. Moreover, unlike the evidence approximation, the exact result neither has to be re-calculated for every new data set, nor requires the running of computer code (the exact result is closed form). In addition, it turns out that the evidence procedure's MAP estimate for neural nets is, in toto, approximation error. Another advantage of the exact analysis is that it does not lead one to incorrect intuition, like the claim that using evidence one can "evaluate different priors in light of the data". This paper also discusses sufficiency conditions for the evidence approximation to hold, why it can sometimes give "reasonable" results, etc.

evidence approximation, pew, posterior, (13 more...)

Country: North America > United States > New Mexico > Santa Fe County > Santa Fe (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)