AITopics

Learning curves show how a neural network improves as the number of training examples increases, and how this improvement is related to the network's complexity. The present paper clarifies the asymptotic properties of two learning curves and the relation between them: one concerns the predictive loss (generalization loss) and the other the training loss. The result gives a natural definition of the complexity of a neural network. Moreover, it provides a new criterion for model selection.

artificial intelligence, machine learning, ylx, (14 more...)

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.17)
North America > United States > California > San Mateo County > San Mateo (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

A Parallel Gradient Descent Method for Learning in Analog VLSI Neural Networks

Alspector, J., Meir, R., Yuhas, B., Jayakumar, A., Lippe, D.

Typical methods for gradient descent in neural network learning involve calculating derivatives based on detailed knowledge of the network model. This requires extensive, time-consuming calculations for each pattern presentation, and a level of precision that is difficult to achieve in VLSI. We present here a perturbation technique that measures, rather than calculates, the gradient. Since the technique uses the actual network as a measuring device, errors in modeling neuron activation and synaptic weights do not cause errors in gradient descent. The method is parallel in nature and easy to implement in VLSI. We describe the theory of such an algorithm, an analysis of its domain of applicability, some simulations using it, and an outline of a hardware implementation.
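The idea of measuring the gradient instead of calculating it can be illustrated with a minimal sketch. It assumes one common form of parallel perturbation: every weight is nudged at once by a small random-sign amount, the resulting change in error is measured, and each weight's gradient estimate is that change divided by its own perturbation. The names `measured_gradient`, `train`, and `quadratic_loss` are hypothetical, and the quadratic loss merely stands in for an error that would be read off the physical network; this is not necessarily the exact update rule used in the paper.

```python
import random


def measured_gradient(loss, weights, delta=1e-3, rng=random):
    """Estimate the gradient from one parallel perturbation of all weights."""
    # Perturb every weight simultaneously by +delta or -delta.
    signs = [rng.choice((-1.0, 1.0)) for _ in weights]
    perturbed = [w + delta * s for w, s in zip(weights, signs)]
    # "Measure" the change in error; no derivative of the model is computed.
    change = loss(perturbed) - loss(weights)
    # Each weight's estimate is the measured change over its own perturbation.
    return [change / (delta * s) for s in signs]


def train(loss, weights, lr=0.05, steps=2000, seed=0):
    """Plain gradient descent driven by the measured gradient estimates."""
    rng = random.Random(seed)
    w = list(weights)
    for _ in range(steps):
        g = measured_gradient(loss, w, rng=rng)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w


def quadratic_loss(w):
    # Hypothetical stand-in for the error measured on the real network.
    return (w[0] - 1.0) ** 2 + (w[1] + 2.0) ** 2


final = train(quadratic_loss, [0.0, 0.0])
print(final)
```

Each step costs two loss evaluations regardless of the number of weights, which is what makes this kind of scheme attractive when the loss is cheap to measure but the model is hard to differentiate.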

artificial intelligence, machine learning, perturbation, (13 more...)

Country:

North America > United States > California > San Mateo County > San Mateo (0.05)
North America > United States > New York (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)

Industry: Semiconductors & Electronics (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.94)

Automatic Learning Rate Maximization by On-Line Estimation of the Hessian's Eigenvectors

LeCun, Yann, Simard, Patrice Y., Pearlmutter, Barak

We propose a very simple and well-principled way of computing the optimal step size in gradient descent algorithms. The online version is computationally very efficient and is applicable to large backpropagation networks trained on large data sets. The main ingredient is a technique for estimating the principal eigenvalue(s) and eigenvector(s) of the objective function's second derivative matrix (Hessian), which does not even require calculating the Hessian. Several other applications of this technique are proposed for speeding up learning or for eliminating useless parameters.

1 INTRODUCTION

Choosing the appropriate learning rate, or step size, in a gradient descent procedure such as backpropagation is simultaneously one of the most crucial and most expert-intensive parts of neural-network learning. We propose a method for computing the best step size which is well-principled, simple, very cheap computationally, and, most of all, applicable to online training with large networks and data sets.
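One way to estimate the Hessian's principal eigenvalue without ever forming the Hessian is power iteration in which each Hessian-vector product is replaced by a finite difference of two gradients. The sketch below assumes that approach; it is a batch illustration, not the paper's online procedure. The function names, the step `alpha`, and the toy objective with Hessian diag(4, 1) are all hypothetical.

```python
import math


def grad(w):
    # Hypothetical objective E(w) = 0.5 * (4*w0**2 + w1**2), so the Hessian
    # is diag(4, 1). Only the gradient is ever evaluated.
    return [4.0 * w[0], 1.0 * w[1]]


def principal_eigenvalue(grad_fn, w, alpha=1e-3, iters=50):
    """Power iteration using H v ~ (grad(w + alpha*v) - grad(w)) / alpha."""
    psi = [1.0] * len(w)  # arbitrary nonzero starting vector
    g0 = grad_fn(w)
    for _ in range(iters):
        norm = math.sqrt(sum(p * p for p in psi))
        # Probe the gradient a small step along the normalized direction.
        probe = [wi + alpha * p / norm for wi, p in zip(w, psi)]
        g1 = grad_fn(probe)
        # The finite difference approximates the Hessian times that direction.
        psi = [(a - b) / alpha for a, b in zip(g1, g0)]
    norm = math.sqrt(sum(p * p for p in psi))
    # The norm approaches the magnitude of the dominant eigenvalue, and the
    # normalized vector approaches the corresponding eigenvector.
    return norm, [p / norm for p in psi]


lam, vec = principal_eigenvalue(grad, [0.3, -0.7])
print(lam, vec)
```

Each iteration needs only one extra gradient evaluation, so the cost scales like ordinary backpropagation rather than like building a full second-derivative matrix.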

artificial intelligence, eigenvalue, machine learning, (13 more...)

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
North America > Canada > Ontario > Toronto (0.14)
North America > United States > Colorado > Denver County > Denver (0.05)

Industry: Education > Educational Setting > Online (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.58)

Adaptive Stimulus Representations: A Computational Theory of Hippocampal-Region Function

Gluck, Mark A., Myers, Catherine E.

We present a theory of cortico-hippocampal interaction in discrimination learning. The hippocampal region is presumed to form new stimulus representations which facilitate learning by enhancing the discriminability of predictive stimuli and compressing stimulus-stimulus redundancies. The cortical and cerebellar regions, which are the sites of long-term memory.

artificial intelligence, machine learning, representation, (14 more...)

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > New York (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Diffusion Approximations for the Constant Learning Rate Backpropagation Algorithm and Resistence to Local Minima

Finnoff, William

η of the gradient updates is held constant (simple backpropagation).

artificial intelligence, machine learning, training process, (12 more...)

Country: Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.68)

A Recurrent Neural Network for Generation of Occular Saccades

Massone, Lina L.E.

Electrophysiological studies (Cynader and Berman 1972, Robinson 1972) showed that the intermediate layer of SC is topographically organized into a motor map. The location of active neurons in this area was found to be related to the oculomotor error (i.e.

artificial intelligence, machine learning, neuron, (17 more...)

Country: North America > United States > Illinois > Cook County > Chicago (0.05)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.49)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.43)

Harmonic Grammars for Formal Languages

Smolensky, Paul

Basic connectionist principles imply that grammars should take the form of systems of parallel soft constraints defining an optimization problem, the solutions to which are the well-formed structures in the language. Such Harmonic Grammars have been successfully applied to a number of problems in the theory of natural languages. Here it is shown that formal languages too can be specified by Harmonic Grammars, rather than by conventional serial rewrite rule systems.

1 HARMONIC GRAMMARS

In collaboration with Geraldine Legendre, Yoshiro Miyata, and Alan Prince, I have been studying how symbolic computation in human cognition can arise naturally as a higher-level virtual machine realized in appropriately designed lower-level connectionist networks. The basic computational principles of the approach are these: (1) a. When analyzed at the lower level, mental representations are distributed patterns of connectionist activity; when analyzed at a higher level, these same representations constitute symbolic structures.

harmonic grammar, node, smolensky, (15 more...)

Country:

North America > United States > Colorado > Boulder County > Boulder (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
North America > United States > Illinois > Cook County > Chicago (0.05)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.48)

Filter Selection Model for Generating Visual Motion Signals

Nowlan, Steven J., Sejnowski, Terrence J.

We present a model of how MT cells aggregate responses from V1 to form such a velocity representation. Two different sets of units, with local receptive fields, receive inputs from motion energy filters. One set of units forms estimates of local motion, while the second set computes the utility of these estimates. Outputs from this second set of units "gate" the outputs from the first set through a gain control mechanism. This active process for selecting only a subset of local motion responses to integrate into more global responses distinguishes our model from previous models of velocity estimation.
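The gating arrangement can be illustrated with a small sketch. It assumes, purely for illustration, that the gain control takes the form of a softmax over the selection units' outputs, so that the global velocity is a weighted average of the local estimates. The function name `gated_velocity` and the example numbers are hypothetical and are not taken from the paper.

```python
import math


def gated_velocity(local_estimates, selection_scores):
    """Combine local (vx, vy) estimates using softmax gates from selection units."""
    # Softmax over selection scores (assumed form of the gain control);
    # subtracting the maximum keeps the exponentials numerically stable.
    m = max(selection_scores)
    exps = [math.exp(s - m) for s in selection_scores]
    total = sum(exps)
    gates = [e / total for e in exps]
    # Each local estimate contributes in proportion to its gate value.
    vx = sum(g * v[0] for g, v in zip(gates, local_estimates))
    vy = sum(g * v[1] for g, v in zip(gates, local_estimates))
    return (vx, vy), gates


# Two locations agree on rightward motion; a third, unreliable location
# disagrees but is assigned a low selection score.
estimates = [(1.0, 0.0), (1.0, 0.0), (-3.0, 2.0)]
scores = [5.0, 5.0, -5.0]
velocity, gates = gated_velocity(estimates, scores)
print(velocity, gates)
```

With these scores the third location's gate is close to zero, so the combined estimate stays near the rightward motion reported by the two trusted locations.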

filter selection model, receptive field location, selection unit, (15 more...)