AITopics

One method proposed for improving the generalization capability of a feedforward network trained with the backpropagation algorithm is to use artificial training vectors which are obtained by adding noise to the original training vectors. We discuss the connection of such backpropagation training with noise to kernel density and kernel regression estimation. We compare by simulated examples (1) backpropagation, (2) backpropagation with noise, and (3) kernel regression in mapping estimation and pattern classification contexts.
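As a rough illustration of the artificial training vectors described above, the sketch below jitters each original vector with zero-mean Gaussian noise. The function name `noisy_copies`, the Gaussian noise model, and the values of `sigma` and `copies` are assumptions made for this example, not details taken from the paper. Drawing an original vector and adding Gaussian noise is the same as sampling from a Gaussian kernel density estimate built on the training set, which is one way to see the connection the abstract mentions.

```python
import random


def noisy_copies(vectors, sigma, copies, seed=0):
    """Return `copies` noisy versions of every vector in `vectors`.

    Each copy adds independent zero-mean Gaussian noise with standard
    deviation `sigma` to every component (an assumed noise model).
    """
    rng = random.Random(seed)
    return [
        [x + rng.gauss(0.0, sigma) for x in v]
        for v in vectors
        for _ in range(copies)
    ]


# Two hypothetical 2-dimensional training vectors.
train = [[0.0, 1.0], [1.0, 0.0]]
augmented = noisy_copies(train, sigma=0.1, copies=3)
print(len(augmented))  # 6 artificial training vectors
```

Here `sigma` plays the role of the kernel width: larger values smooth the implied density more strongly.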

backpropagation, noise, vector

Country: Europe > Finland > Uusimaa > Helsinki (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (1.00)

Krogh, Anders, Hertz, John A.

A Simple Weight Decay Can Improve Generalization

It has been observed in numerical simulations that a weight decay can improve generalization in a feed-forward neural network.
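A minimal sketch of one common form of weight decay, assuming a quadratic penalty `(lam / 2) * sum(w_i ** 2)` added to the training error. The names `decay_step`, `lr`, and `lam` are illustrative and are not taken from the paper.

```python
def decay_step(weights, grads, lr, lam):
    """One gradient step on E(w) + (lam / 2) * sum(w_i ** 2).

    `grads` is the gradient of the unpenalized error E at `weights`;
    the penalty contributes the extra term lam * w to each component.
    """
    return [w - lr * (g + lam * w) for w, g in zip(weights, grads)]


# With a zero error gradient, every weight shrinks by the factor 1 - lr * lam.
w = [1.0, -2.0]
w_next = decay_step(w, grads=[0.0, 0.0], lr=0.1, lam=0.5)
print(w_next)
```

Weights that the error gradient does not hold in place decay geometrically toward zero, which is how the penalty suppresses components the data do not constrain.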

generalization error, vector, weight decay

Country:

North America > United States > California > Santa Cruz County > Santa Cruz (0.14)
Europe > Denmark > Capital Region > Copenhagen (0.05)
North America > United States > New York (0.04)

Genre: Research Report (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Bertoni, Alberto, Campadelli, Paola, Morpurgo, Anna, Panizza, Sandra

Polynomial Uniform Convergence of Relative Frequencies to Probabilities

We define the concept of polynomial uniform convergence of relative frequencies to probabilities in the distribution-dependent context.

probability, relative frequency, uniform convergence

Country: Europe > Italy > Lombardy > Milan (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.50)

Simard, Patrice, Victorri, Bernard, LeCun, Yann, Denker, John

Tangent Prop - A formalism for specifying selected invariances in an adaptive network

In many machine learning applications, one has access not only to training data but also to some high-level a priori knowledge about the desired behavior of the system. For example, it is known in advance that the output of a character recognizer should be invariant with respect to small spatial distortions of the input images (translations, rotations, scale changes, etc.). We have implemented a scheme that allows a network to learn the derivative of its outputs with respect to distortion operators of our choosing. This not only reduces the learning time and the amount of training data, but also provides a powerful language for specifying what generalizations we wish the network to perform.

1 INTRODUCTION

In machine learning, one very often knows more about the function to be learned than just the training data. An interesting case is when certain directional derivatives of the desired function are known at certain points.
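The idea of constraining a directional derivative can be sketched as a penalty term. The example below is only an illustration under stated assumptions: it estimates the derivative with a forward finite difference rather than whatever exact computation the authors use, and the names `directional_derivative`, `invariance_penalty`, and `toy_output` are hypothetical. The vector `t` stands for the direction in input space in which a small distortion moves the input.

```python
def directional_derivative(f, x, t, eps=1e-4):
    """Forward-difference estimate of the derivative of f at x along t."""
    shifted = [xi + eps * ti for xi, ti in zip(x, t)]
    return (f(shifted) - f(x)) / eps


def invariance_penalty(f, x, t, eps=1e-4):
    """Squared directional derivative; zero when f is locally invariant along t."""
    return directional_derivative(f, x, t, eps) ** 2


def toy_output(x):
    """Stand-in for a network output: depends only on the sum of the inputs."""
    return x[0] + x[1]


x = [1.0, 2.0]
flat = invariance_penalty(toy_output, x, t=[1.0, -1.0])  # direction that keeps the sum fixed
steep = invariance_penalty(toy_output, x, t=[1.0, 1.0])  # direction that changes the sum
print(flat, steep)
```

Adding such a penalty to the usual training error would push the model toward outputs that do not change along the chosen distortion directions.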

invariance, tangent vector, transformation

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > New York (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Haussler, David, Kearns, Michael, Opper, Manfred, Schapire, Robert

Estimating Average-Case Learning Curves Using Bayesian, Statistical Physics and VC Dimension Methods

In this paper we investigate an average-case model of concept learning, and give results that place the popular statistical physics and VC dimension theories of learning curve behavior in a common framework.

algorithm, gibbs algorithm, probability

Country:

North America > United States > California > Santa Cruz County > Santa Cruz (0.14)
North America > United States > New Jersey (0.05)
North America > United States > New York (0.04)
Europe > Germany (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

The Effective Number of Parameters: An Analysis of Generalization and Regularization in Nonlinear Learning Systems

Moody, John E.

We present an analysis of how the generalization performance (expected test set error) relates to the expected training set error for nonlinear learning systems, such as multilayer perceptrons and radial basis functions.

akaike, effective number, peff

Country:

North America > United States > New York (0.05)
North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
North America > United States > Connecticut > New Haven County > New Haven (0.04)
Europe > Hungary > Budapest > Budapest (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.55)

Bayesian Model Comparison and Backprop Nets

MacKay, David J. C.

The Bayesian model comparison framework is reviewed, and the Bayesian Occam's razor is explained. This framework can be applied to feedforward networks, making possible (1) objective comparisons between solutions using alternative network architectures; (2) objective choice of magnitude and type of weight decay terms; (3) quantified estimates of the error bars on network parameters and on network output. The framework also generates a measure of the effective number of parameters determined by the data. The relationship of Bayesian model comparison to recent work on prediction of generalisation ability (Guyon et al., 1992, Moody, 1992) is discussed.

error bar, inference, occam factor

Country:

North America > United States > California > Los Angeles County > Pasadena (0.04)
Europe > United Kingdom (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Principles of Risk Minimization for Learning Theory

Vapnik, V.

Learning is posed as a problem of function estimation, for which two principles of solution are considered: empirical risk minimization and structural risk minimization. These two principles are applied to two different statements of the function estimation problem: global and local. Systematic improvements in prediction power are illustrated in application to zip-code recognition.

algorithm, minimization, risk minimization

Country:

North America > United States > New York (0.04)
North America > United States > California (0.04)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
Asia > Russia (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Constrained Optimization Applied to the Parameter Setting Problem for Analog Circuits

Kirk, David, Fleischer, Kurt, Watts, Lloyd, Barr, Alan

We use constrained optimization to select operating parameters for two circuits: a simple 3-transistor square root circuit, and an analog VLSI artificial cochlea. This automated method uses computer controlled measurement and test equipment to choose chip parameters which minimize the difference between the actual circuit's behavior and a specified goal behavior. Choosing the proper circuit parameters is important to compensate for manufacturing deviations or adjust circuit performance within a certain range. As biologically-motivated analog VLSI circuits become increasingly complex, implying more parameters, setting these parameters by hand will become more cumbersome. Thus an automated parameter setting method can be of great value [Fleischer 90].

constrained optimization applied, error metric, optimization

Country:

North America > United States > California > Los Angeles County > Pasadena (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry: Semiconductors & Electronics (0.57)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

A Neural Network for Motion Detection of Drift-Balanced Stimuli

Tunley, Hilary

This paper briefly describes an artificial neural network for preattentive visual processing. The network is capable of determining image motion in a type of stimulus which defeats most popular methods of motion detection.

detector, motion detection, neural network

Country:

North America > United States > California > Orange County > Irvine (0.04)
Europe > United Kingdom > Scotland > City of Glasgow > Glasgow (0.04)
Europe > United Kingdom > England > East Sussex > Brighton (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Industry: Commercial Services & Supplies > Security & Alarm Services (0.45)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)