 Opper, Manfred


Dynamics of Training

Neural Information Processing Systems

A new method for calculating the full training process of a neural network is introduced. It requires no sophisticated techniques such as the replica trick, and its results relate directly to the actual number of training steps. Several results are presented, including the maximal learning rate, an exact description of early stopping, and the necessary number of training steps. Further problems can be addressed with this approach.
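
For intuition about one of the quantities mentioned above, here is a minimal sketch, not the paper's method: for plain gradient descent on a quadratic loss with Hessian H, the iteration is stable only for learning rates below 2/lambda_max(H), which is the simplest instance of a "maximal learning rate". The toy problem below is an illustrative assumption.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical quadratic problem: H is a random, well-conditioned Hessian.
A = rng.standard_normal((20, 20))
H = A @ A.T / 20.0 + 0.5 * np.eye(20)
w_star = rng.standard_normal(20)

lam_max = np.linalg.eigvalsh(H).max()
eta_max = 2.0 / lam_max  # maximal stable learning rate for this loss

def run(eta, steps=500):
    """Gradient descent on L(w) = 0.5 (w - w*)^T H (w - w*); returns final error."""
    w = np.zeros_like(w_star)
    for _ in range(steps):
        w -= eta * H @ (w - w_star)  # gradient step
    return np.linalg.norm(w - w_star)

print(f"eta_max = {eta_max:.3f}")
print("eta just below eta_max:", run(0.95 * eta_max))  # stays stable, converges
print("eta just above eta_max:", run(1.05 * eta_max))  # blows up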


A Mean Field Algorithm for Bayes Learning in Large Feed-forward Neural Networks

Neural Information Processing Systems

In the Bayes approach to statistical inference [Berger, 1985] one assumes that the prior uncertainty about the parameters of an unknown data-generating mechanism can be encoded in a probability distribution, the so-called prior. Using the prior and the likelihood of the data given the parameters, the posterior distribution of the parameters can be derived from Bayes' rule. From this posterior, various estimates for functions of the parameters, such as predictions about unseen data, can be calculated. In general, however, those predictions cannot be realised by specific parameter values, but only by an ensemble average over parameters according to the posterior probability. Hence, exact implementations of Bayes' method for neural networks require averages over network parameters, which in general can be performed only by time-consuming Monte Carlo procedures.
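
As an illustration of the ensemble average referred to above, here is a minimal Monte Carlo sketch for Bayesian linear regression, a conjugate toy model rather than the paper's mean field algorithm for neural networks: the prediction at a new input is an average over posterior parameter samples instead of a plug-in estimate from a single parameter value. The data-generating setup below is a hypothetical example.

import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data-generating mechanism: y = x . w_true + Gaussian noise.
n, d, sigma = 50, 5, 0.1
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true + sigma * rng.standard_normal(n)

# A Gaussian prior w ~ N(0, I) with a Gaussian likelihood gives a Gaussian
# posterior in closed form (conjugacy), so the Monte Carlo estimate can be
# checked against the exact posterior mean.
S = np.linalg.inv(X.T @ X / sigma**2 + np.eye(d))   # posterior covariance
mu = S @ X.T @ y / sigma**2                         # posterior mean

# Prediction at a new input as an ensemble average over posterior samples.
x_new = rng.standard_normal(d)
samples = rng.multivariate_normal(mu, S, size=10_000)
pred_mc = (samples @ x_new).mean()   # Monte Carlo posterior average
pred_exact = mu @ x_new              # exact posterior predictive mean
print(pred_mc, pred_exact)           # the two agree closely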


Estimating Average-Case Learning Curves Using Bayesian, Statistical Physics and VC Dimension Methods

Neural Information Processing Systems

In this paper we investigate an average-case model of concept learning, and give results that place the popular statistical physics and VC dimension theories of learning curve behavior in a common framework.
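
As a concrete picture of a learning curve in an average-case setting, here is a small simulation, a hypothetical setup rather than the paper's analysis: the average generalization error of a simple linear learner on a random linear threshold concept, as a function of the number of training examples m.

import numpy as np

rng = np.random.default_rng(2)
d = 10
w_target = rng.standard_normal(d)  # hypothetical target concept

def gen_error(m, trials=200, n_test=2000):
    """Average test error after learning from m random examples."""
    errs = []
    for _ in range(trials):
        X = rng.standard_normal((m, d))
        y = np.sign(X @ w_target)
        w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)  # simple linear learner
        X_test = rng.standard_normal((n_test, d))
        errs.append(np.mean(np.sign(X_test @ w_hat) != np.sign(X_test @ w_target)))
    return float(np.mean(errs))

for m in (5, 10, 20, 40, 80):
    print(m, gen_error(m))  # the error falls as m grows, roughly like d/m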