A Model of Recurrent Interactions in Primary Visual Cortex
Todorov, Emanuel, Siapas, Athanassios, Somers, David
A general feature of the cerebral cortex is its massive interconnectivity: it has been estimated anatomically [19] that cortical neurons receive upwards of 5,000 synapses, the majority of which originate from other nearby cortical neurons. Numerous experiments in primary visual cortex (V1) have revealed strongly nonlinear interactions between stimulus elements which activate classical and nonclassical receptive field regions. Recurrent cortical connections likely contribute substantially to these effects. However, most theories of visual processing have either assumed a feedforward processing scheme [7], or have used recurrent interactions to account for isolated effects only [1, 16, 18]. Since nonlinear systems cannot in general be taken apart and analyzed in pieces, it is not clear what one learns by building a recurrent model that accounts for only one, or very few, phenomena. Here we develop a relatively simple model of recurrent interactions in V1 that reflects major anatomical and physiological features of intracortical connectivity and simultaneously accounts for a wide range of phenomena observed physiologically. All phenomena we address are strongly nonlinear and cannot be explained by linear feedforward models.
Recursive Algorithms for Approximating Probabilities in Graphical Models
Jaakkola, Tommi, Jordan, Michael I.
Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139. Abstract: We develop a recursive node-elimination formalism for efficiently approximating large probabilistic networks. No constraints are set on the network topologies. Yet the formalism can be straightforwardly integrated with exact methods whenever they are or become applicable. The approximations we use are controlled: they maintain consistently upper and lower bounds on the desired quantities at all times. We show that Boltzmann machines, sigmoid belief networks, or any combination (i.e., chain graphs) can be handled within the same framework.
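As a small numerical illustration of the kind of controlled bound this framework relies on (a sketch only; the recursive node-elimination scheme itself is not reproduced here, and the choice of the variational parameter xi is arbitrary), the quadratic variational lower bound on the logistic sigmoid holds for every input and touches the sigmoid at x = +xi and x = -xi:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def sigmoid_lower_bound(x, xi):
        # variational quadratic lower bound:
        # sigma(x) >= sigma(xi) * exp((x - xi)/2 - lam * (x**2 - xi**2)),
        # with lam = tanh(xi/2) / (4 * xi)
        lam = np.tanh(xi / 2.0) / (4.0 * xi)
        return sigmoid(xi) * np.exp((x - xi) / 2.0 - lam * (x ** 2 - xi ** 2))

    x = np.linspace(-8.0, 8.0, 401)
    assert np.all(sigmoid_lower_bound(x, xi=2.0) <= sigmoid(x) + 1e-12)
    # the bound is exact at x = +xi and x = -xi, and loose elsewhere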
Limitations of Self-organizing Maps for Vector Quantization and Multidimensional Scaling
SOM can be said to do clustering/vector quantization (VQ) and at the same time to preserve the spatial ordering of the input data, reflected by an ordering of the code book vectors (cluster centroids) in a one or two dimensional output space, where the latter property is closely related to multidimensional scaling (MDS) in statistics. Although the level of activity and research around the SOM algorithm is quite large (a recent overview by [Kohonen 95] contains more than 1000 citations), only little comparison among the numerous existing variants of the basic approach, and also to more traditional statistical techniques within the larger frameworks of VQ and MDS, is available. Additionally, there is only little advice in the literature about how to properly use SOM in order to get optimal results in terms of either vector quantization (VQ) or multidimensional scaling or maybe even both of them. To make the notion of SOM being a tool for "data visualization" more precise, the following question has to be answered: should SOM be used for doing VQ, MDS, both at the same time, or none of them? Two recent comprehensive studies comparing SOM either to traditional VQ or MDS techniques separately seem to indicate that SOM is not competitive when used for either VQ or MDS: [Balakrishnan et al. 94] compare SOM to K-means clustering on 108 multivariate normal clustering problems with known clustering solutions and show that SOM performs significantly worse in terms of data points misclassified.
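As context for the VQ/MDS distinction discussed above, here is a minimal SOM sketch (toy data and schedules are illustrative, not taken from the paper): each input pulls its best matching unit and, through the neighbourhood function, nearby units on a one-dimensional map, which is what couples the quantization to the ordering of the code book.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 2))              # toy two-dimensional inputs
    n_units = 10                               # one-dimensional map of 10 units
    W = rng.normal(size=(n_units, 2))          # code book vectors
    grid = np.arange(n_units)                  # unit positions on the map

    for t in range(2000):
        x = X[rng.integers(len(X))]
        bmu = np.argmin(np.linalg.norm(W - x, axis=1))     # best matching unit
        lr = 0.5 * np.exp(-t / 1000.0)                     # decaying learning rate
        sigma = 3.0 * np.exp(-t / 1000.0)                  # shrinking neighbourhood
        h = np.exp(-((grid - bmu) ** 2) / (2.0 * sigma ** 2))
        W += lr * h[:, None] * (x - W)                     # pull units toward x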
For Valid Generalization the Size of the Weights is More Important than the Size of the Network
Baum and Haussler [4] used these results to give sample size bounds for multi-layer threshold networks that grow at least as quickly as the number of weights (see also [7]). However, for pattern classification applications the VC-bounds seem loose; neural networks often perform successfully with training sets that are considerably smaller than the number of weights. This paper shows that for classification problems on which neural networks perform well, if the weights are not too big, the size of the weights determines the generalization performance. In contrast with the function classes and algorithms considered in the VC-theory, neural networks used for binary classification problems have real-valued outputs, and learning algorithms typically attempt to minimize the squared error of the network output over a training set. As well as encouraging the correct classification, this tends to push the output away from zero and towards the target values of {-1, 1}.
Balancing Between Bagging and Bumping
We compare different methods to combine predictions from neural networks trained on different bootstrap samples of a regression problem. One of these methods, introduced in [6] and which we here call balancing, is based on the decomposition of the ensemble generalization error into an ambiguity term and a term incorporating generalization performances of individual networks. We show how to estimate these individual errors from the residuals on validation patterns. Weighting factors for the different networks follow from a quadratic programming problem. On a real-world problem concerning the prediction of sales figures and on the well-known Boston housing data set, balancing clearly outperforms other recently proposed alternatives such as bagging [1] and bumping [8]. 1 EARLY STOPPING AND BOOTSTRAPPING Stopped training is a popular strategy to prevent overfitting in neural networks. The complete data set is split up into a training and a validation set.
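A hedged sketch of the weighting step described above, under the assumption that the objective is the validation error of the weighted ensemble (the exact objective of [6] may differ; function and variable names are illustrative):

    import numpy as np
    from scipy.optimize import minimize

    def balance_weights(residuals):
        """residuals: array of shape (n_validation_patterns, n_networks)."""
        C = residuals.T @ residuals / len(residuals)       # error (co)variances
        m = C.shape[0]
        cons = ({'type': 'eq', 'fun': lambda w: w.sum() - 1.0},)
        res = minimize(lambda w: w @ C @ w,                # quadratic objective
                       np.full(m, 1.0 / m),
                       bounds=[(0.0, 1.0)] * m,
                       constraints=cons, method='SLSQP')
        return res.x

    # combined prediction: predictions_matrix @ balance_weights(residuals)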
Neural Network Modeling of Speech and Music Signals
Time series prediction is one of the major applications of neural networks. After a short introduction into the basic theoretical foundations, we argue that the iterated prediction of a dynamical system may be interpreted as a model of the system dynamics. By means of RBF neural networks we describe a modeling approach and extend it to be able to model nonstationary systems. As a practical test for the capabilities of the method we investigate the modeling of musical and speech signals and demonstrate that the model may be used for synthesis of musical and speech signals. 1 Introduction Since the formulation of the reconstruction theorem by Takens [10] it has been clear that a nonlinear predictor of a dynamical system may be directly derived from a system's time series. The method has been investigated extensively and with good success for the prediction of time series of nonlinear systems.
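A minimal sketch of the delay-embedding and iterated-prediction idea, using a toy sine signal in place of real speech or music and arbitrary embedding and RBF settings (none of these choices come from the paper):

    import numpy as np

    def rbf_features(X, centres, width):
        d = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        return np.exp(-(d / width) ** 2)

    rng = np.random.default_rng(0)
    t = np.arange(2000)
    s = np.sin(0.07 * t) + 0.01 * rng.normal(size=len(t))           # toy "signal"

    m = 4                                                            # embedding dimension
    X = np.stack([s[i:len(s) - m + i] for i in range(m)], axis=1)    # delay vectors
    y = s[m:]                                                        # next-sample targets

    centres = X[rng.choice(len(X), 50, replace=False)]
    Phi = rbf_features(X, centres, width=1.0)
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)                      # output weights

    # iterated prediction: feed each prediction back in as the newest input
    state, synth = X[-1].copy(), []
    for _ in range(500):
        nxt = (rbf_features(state[None, :], centres, 1.0) @ w)[0]
        synth.append(nxt)
        state = np.roll(state, -1)
        state[-1] = nxt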
The Generalisation Cost of RAMnets
Rohwer, Richard, Morciniec, Michal
Neural Computing Research Group, Aston University, Aston Triangle, Birmingham B4 7ET, UK. Abstract: Given unlimited computational resources, it is best to use a criterion of minimal expected generalisation error to select a model and determine its parameters. However, it may be worthwhile to sacrifice some generalisation performance for higher learning speed. A method for quantifying sub-optimality is set out here, so that this choice can be made intelligently. Furthermore, the method is applicable to a broad class of models, including ultra-fast memory-based methods such as RAMnets. This brings the added benefit of providing, for the first time, the means to analyse the generalisation properties of such models in a Bayesian framework. 1 Introduction In order to quantitatively predict the performance of methods such as the ultra-fast RAMnet, which are not trained by minimising a cost function, we develop a Bayesian formalism for estimating the generalisation cost of a wide class of algorithms.
Using Curvature Information for Fast Stochastic Search
Orr, Genevieve B., Leen, Todd K.
We present an algorithm for fast stochastic gradient descent that uses a nonlinear adaptive momentum scheme to optimize the late time convergence rate. The algorithm makes effective use of curvature information, requires only O(n) storage and computation, and delivers convergence rates close to the theoretical optimum. We demonstrate the technique on linear and large nonlinear backprop networks.
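For orientation only, a toy heavy-ball momentum run on a noisy quadratic; this is not the authors' nonlinear adaptive momentum rule, just the standard setting it builds on, with made-up curvatures and noise:

    import numpy as np

    rng = np.random.default_rng(0)
    curvatures = np.array([1.0, 10.0])        # eigenvalues of the Hessian
    w = np.array([5.0, 5.0])                  # parameters, optimum at the origin
    v = np.zeros_like(w)                      # momentum ("velocity") term
    lr, mu = 0.01, 0.9

    for t in range(5000):
        grad = curvatures * w + 0.1 * rng.normal(size=2)   # noisy gradient
        v = mu * v - lr * grad
        w = w + v
    # with momentum mu the mean dynamics behave roughly like plain SGD with
    # learning rate lr / (1 - mu); adaptive schemes tune this per direction
    # using estimates of the local curvature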
Dynamics of Training
Bös, Siegfried, Opper, Manfred
A new method to calculate the full training process of a neural network is introduced. No sophisticated methods like the replica trick are used. The results are directly related to the actual number of training steps. Several results are presented here, such as the maximal learning rate, an exact description of early stopping, and the necessary number of training steps. Further problems can be addressed with this approach.
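This is not the paper's calculation, only the classical reference point that "maximal learning rate" alludes to, checked numerically on a synthetic toy problem: batch gradient descent on a quadratic cost diverges once the learning rate exceeds 2 divided by the largest Hessian eigenvalue.

    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 5))
    w_true = rng.normal(size=5)
    y = X @ w_true

    H = X.T @ X / len(X)                        # Hessian of the squared-error cost
    eta_max = 2.0 / np.linalg.eigvalsh(H).max()

    def train(eta, steps=200):
        w = np.zeros(5)
        for _ in range(steps):
            w -= eta * (H @ w - X.T @ y / len(X))   # gradient step
        return np.linalg.norm(w - w_true)

    print(train(0.9 * eta_max))    # converges toward w_true
    print(train(1.1 * eta_max))    # diverges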
Noisy Spiking Neurons with Temporal Coding have more Computational Power than Sigmoidal Neurons
Furthermore it is shown that networks of noisy spiking neurons with temporal coding have a strictly larger computational power than sigmoidal neural nets with the same number of units. 1 Introduction and Definitions We consider a formal model SNN for a spiking neuron network that is basically a reformulation of the spike response model (and of the leaky integrate and fire model) without using δ-functions (see [Maass, 1996a] or [Maass, 1996b] for further background).
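A rough numerical sketch of temporal coding with a noisy spiking unit (a generic leaky integrate-and-fire simulation with arbitrary parameters, not the SNN formalism of the paper): the latency of the first spike carries the analog input value, with stronger drive producing an earlier spike.

    import numpy as np

    rng = np.random.default_rng(0)
    dt, tau, threshold = 0.001, 0.02, 1.0      # step (s), membrane time constant (s), threshold
    input_current = 60.0                       # constant drive (arbitrary units)
    v, spike_times = 0.0, []

    for step in range(1000):
        noise = 0.5 * rng.normal() * np.sqrt(dt)
        v += dt * (-v / tau + input_current) + noise
        if v >= threshold:                     # spike and reset
            spike_times.append(step * dt)
            v = 0.0

    print(spike_times[:3])    # first-spike latency shrinks as the drive grows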