Machine Learning
Learning Curves, Model Selection and Complexity of Neural Networks
Murata, Noboru, Yoshizawa, Shuji, Amari, Shun-ichi
Learning curves show how a neural network improves as the number of training examples increases and how this improvement is related to the network's complexity. The present paper clarifies the asymptotic properties of two learning curves and the relation between them: one concerning the predictive loss or generalization loss, and the other the training loss. The result gives a natural definition of the complexity of a neural network. Moreover, it provides a new criterion for model selection.
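As a hedged illustration of the shape such asymptotic results typically take (the symbols below are generic, not the paper's own notation): with L_0 the best achievable loss, t the number of training examples, and m_* an effective number of parameters, the two curves approach L_0 from opposite sides, and their gap provides both a complexity measure and a model-selection penalty.

```latex
% Illustrative notation, not taken verbatim from the paper:
% L_0 = best achievable loss, t = number of training examples,
% m_* = effective number of parameters (the complexity measure).
\begin{align}
  \langle L_{\mathrm{gen}}(t) \rangle   &\approx L_0 + \frac{m_*}{2t}, &
  \langle L_{\mathrm{train}}(t) \rangle &\approx L_0 - \frac{m_*}{2t}.
\end{align}
% The gap between the two curves, roughly m_*/t, is what a
% model-selection criterion of this kind penalizes.
```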
Analog VLSI Implementation of Multi-dimensional Gradient Descent
Kirk, David B., Kerns, Douglas, Fleischer, Kurt, Barr, Alan H.
The implementation uses noise injection and multiplicative correlation to estimate derivatives, as in [Anderson, Kerns 92]. One intended application of this technique is setting circuit parameters on-chip automatically, rather than manually [Kirk 91]. Gradient descent optimization may be used to adjust synapse weights for a backpropagation or other on-chip learning implementation. The approach combines the features of continuous multidimensional gradient descent with the potential for an annealing style of optimization. We present data measured from our analog VLSI implementation.

1 Introduction

This work is similar to [Anderson, Kerns 92], but represents two advances. First, we describe the extension of the technique to multiple dimensions. Second, we demonstrate an implementation of the multidimensional technique in analog VLSI, and provide results measured from the chip. Unlike previous work using noise sources in adaptive systems, we use the noise as a means of estimating the gradient of a function f(y), rather than performing an annealing process [Alspector 88]. We also estimate gradients continuously in position and time, in contrast to [Umminger 89] and [Jabri 91], which utilize discrete position gradient estimates.
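As a rough discrete-time sketch of the noise-injection idea (the chip operates continuously in time; the function names and the choice of Gaussian perturbations below are assumptions for illustration): correlating the injected noise with the induced change in f(y) yields, on average, a quantity proportional to the gradient.

```python
import numpy as np

def noise_gradient_estimate(f, y, sigma=0.01, n_samples=200):
    """Estimate grad f(y) by multiplicative correlation of injected
    noise with the resulting change in f, averaged over perturbations.
    E[n * (f(y + n) - f(y))] ~= sigma^2 * grad f(y) for small sigma."""
    grad = np.zeros_like(y)
    for _ in range(n_samples):
        n = np.random.normal(0.0, sigma, size=y.shape)  # injected noise
        grad += n * (f(y + n) - f(y))                   # correlation term
    return grad / (n_samples * sigma**2)

# Toy usage: gradient descent driven entirely by the noisy estimate.
f = lambda y: np.sum((y - 1.0) ** 2)   # toy objective, minimum at y = 1
y = np.zeros(4)
for _ in range(100):
    y -= 0.1 * noise_gradient_estimate(f, y)
```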
Learning Sequential Tasks by Incrementally Adding Higher Orders
An incremental, higher-order, non-recurrent network combines two properties found to be useful for learning sequential tasks: higher-order connections and incremental introduction of new units. The network adds higher orders when needed by adding new units that dynamically modify connection weights. Since the new units modify the weights at the next time-step with information from the previous step, temporal tasks can be learned without the use of feedback, thereby greatly simplifying training. Furthermore, a theoretically unlimited number of units can be added to reach into the arbitrarily distant past. Experiments with the Reber grammar have demonstrated speedups of two orders of magnitude over recurrent networks.
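As a minimal sketch of how a higher-order unit can carry previous-step information into the next step's weights without feedback (the structure and names below are assumptions for illustration, not the paper's architecture):

```python
import numpy as np

def step(x_prev, x_now, W, V):
    """One time step of a non-recurrent higher-order unit.

    The effective first-order weights are modulated by the previous
    input, so temporal context enters without recurrent connections.
    W: base weights (out, in); V: higher-order weights gated by x_prev.
    """
    W_eff = W + np.einsum("ijk,k->ij", V, x_prev)  # second-order term
    return np.tanh(W_eff @ x_now)

x_prev, x_now = np.array([1.0, 0.0]), np.array([0.0, 1.0])
W = np.zeros((3, 2))
V = np.random.normal(0.0, 0.1, size=(3, 2, 2))
y = step(x_prev, x_now, W, V)
```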
Hidden Markov Models in Molecular Biology: New Algorithms and Applications
Baldi, Pierre, Chauvin, Yves, Hunkapiller, Tim, McClure, Marcella A.
Hidden Markov Models (HMMs) can be applied to several important problems in molecular biology. We introduce a new convergent learning algorithm for HMMs that, unlike the classical Baum-Welch algorithm, is smooth and can be applied online or in batch mode, with or without the usual Viterbi most-likely-path approximation. Left-right HMMs with insertion and deletion states are then trained to represent several protein families, including immunoglobulins and kinases. In all cases, the models derived capture all the important statistical properties of the families and can be used efficiently in a number of important tasks such as multiple alignment, motif detection, and classification.
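As a minimal sketch of one way to get a smooth, normalization-preserving alternative to Baum-Welch (the softmax parameterization and the variable names below are assumptions for illustration, not necessarily the paper's exact scheme): parameterize the transition and emission probabilities by unconstrained variables and ascend the forward-algorithm log-likelihood by gradient steps, online or in batch.

```python
import numpy as np

def softmax(u, axis=-1):
    e = np.exp(u - u.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def log_likelihood(u_trans, u_emit, pi, obs):
    """Forward algorithm on a softmax-parameterized HMM.

    A (transitions) and B (emissions) are smooth functions of the
    unconstrained parameters u_trans (S x S) and u_emit (S x V), so
    gradient ascent on this value keeps them valid probabilities."""
    A, B = softmax(u_trans), softmax(u_emit)
    alpha = pi * B[:, obs[0]]
    ll = np.log(alpha.sum())
    alpha = alpha / alpha.sum()        # rescale to avoid underflow
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        ll += np.log(alpha.sum())
        alpha = alpha / alpha.sum()
    return ll
```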
A Hybrid Linear/Nonlinear Approach to Channel Equalization Problems
The channel equalization problem is an important one in high-speed communications: the sequences of symbols transmitted are distorted by neighboring symbols. Traditionally, channel equalization is treated as a channel-inversion operation. One problem with this approach is that there is no direct correspondence between the error probability and the residual error produced by the channel-inversion operation. In this paper, optimal equalizer design is formulated as a classification problem. The optimal classifier can be constructed by the Bayes decision rule; in general it is nonlinear. An efficient hybrid linear/nonlinear approach is proposed to train the equalizer. The error probability of the new linear/nonlinear equalizer has been shown to be better than that of a linear equalizer on an experimental channel.
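As a toy illustration of the classification view versus channel inversion (the two-tap BPSK channel and all names below are hypothetical, chosen only to make the contrast concrete): the Bayes rule scores each symbol by the likelihood of the received sample under that symbol's channel states, and is nonlinear in general.

```python
import numpy as np
from itertools import product

# Toy BPSK channel: y_t = h0*s_t + h1*s_{t-1} + Gaussian noise.
h, sigma = np.array([1.0, 0.5]), 0.2

# Noiseless channel output for every (current, previous) symbol pair.
states = [(s0, h @ np.array([s0, s1]))
          for s0, s1 in product([-1.0, 1.0], repeat=2)]

def bayes_equalizer(y):
    """Bayes decision rule: pick the symbol whose channel states give
    the larger Gaussian-mixture likelihood (nonlinear in y)."""
    scores = {-1.0: 0.0, 1.0: 0.0}
    for s0, center in states:
        scores[s0] += np.exp(-(y - center) ** 2 / (2 * sigma**2))
    return max(scores, key=scores.get)

def linear_equalizer(y):
    """Naive channel inversion on the main tap, then a sign decision."""
    return np.sign(y / h[0])
```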
Learning to categorize objects using temporal coherence
The invariance of an object's identity as it is transformed over time provides a powerful cue for perceptual learning. We present an unsupervised learning procedure which maximizes the mutual information between the representations adopted by a feed-forward network at consecutive time steps. We demonstrate that the network can learn, entirely unsupervised, to classify an ensemble of several patterns by observing pattern trajectories, even though there are abrupt transitions from one object to another between trajectories. The same learning procedure should be widely applicable to a variety of perceptual learning tasks.

1 INTRODUCTION

A promising approach to understanding human perception is to try to model its developmental stages. There is ample evidence that much of perception is learned.
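As a minimal sketch of a temporal mutual-information objective of this kind (the batch estimator below, which treats averaged soft class assignments as a joint distribution, is an assumption for illustration rather than the paper's exact procedure):

```python
import numpy as np

def temporal_mi(p_t, p_t1, eps=1e-12):
    """Mutual information between the network's soft class assignments
    at consecutive time steps, estimated from a batch.

    p_t, p_t1: arrays of shape (batch, n_classes) holding class
    probabilities at times t and t+1. Maximizing this value rewards
    labels that are both informative and temporally coherent."""
    joint = p_t.T @ p_t1 / p_t.shape[0]              # joint over classes
    px = joint.sum(axis=1, keepdims=True)            # marginal at t
    py = joint.sum(axis=0, keepdims=True)            # marginal at t+1
    return np.sum(joint * (np.log(joint + eps) - np.log(px @ py + eps)))
```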
Harmonic Grammars for Formal Languages
Basic connectionist principles imply that grammars should take the form of systems of parallel soft constraints defining an optimization problem, the solutions to which are the well-formed structures in the language. Such Harmonic Grammars have been successfully applied to a number of problems in the theory of natural languages. Here it is shown that formal languages too can be specified by Harmonic Grammars, rather than by conventional serial rewrite rule systems.

1 HARMONIC GRAMMARS

In collaboration with Geraldine Legendre, Yoshiro Miyata, and Alan Prince, I have been studying how symbolic computation in human cognition can arise naturally as a higher-level virtual machine realized in appropriately designed lower-level connectionist networks. The basic computational principles of the approach are these:

(1) a. When analyzed at the lower level, mental representations are distributed patterns of connectionist activity; when analyzed at a higher level, these same representations constitute symbolic structures.
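As a hedged illustration of the optimization view (generic notation, not taken from the paper): a Harmonic Grammar assigns each candidate structure a harmony value, a weighted sum of its soft-constraint violations, and the well-formed structures are those that maximize it.

```latex
% Generic notation, not the paper's own: a is a candidate structure,
% C_k(a) the degree to which it violates soft constraint k, and
% w_k < 0 that constraint's weight (penalty).
\[
  H(a) = \sum_k w_k \, C_k(a), \qquad
  \text{well-formed structures} = \arg\max_a H(a).
\]
```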