Opper, Manfred
Asymptotic Universality for Learning Curves of Support Vector Machines
Opper, Manfred, Urbanczik, Robert
Using methods of statistical physics, we investigate the role of model complexity in learning with support vector machines (SVMs). We show the advantages of using SVMs with kernels of infinite complexity on noisy target rules, which, in contrast to common theoretical beliefs, are found to achieve optimal generalization error although the training error does not converge to the generalization error. Moreover, we find a universal asymptotics of the learning curves which depends only on the target rule but not on the SVM kernel.

1 Introduction
Powerful systems for data inference, like neural networks, implement complex input-output relations by learning from example data. The price one has to pay for the flexibility of these models is the need to choose the proper model complexity for a given task, i.e. the system architecture which gives good generalization ability for novel data. This has also become an important problem for support vector machines [1].
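Purely as a numerical illustration of the phenomenon the abstract describes (and not the statistical-physics analysis itself), the sketch below trains an SVM with an RBF kernel, whose implicit feature space is infinite-dimensional, on a linear target rule corrupted by label noise and reports training and test error separately; the input dimension, noise level, and kernel parameters are arbitrary choices made for the example.

    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)

    def noisy_rule(n, w, flip=0.1):
        # Linear teacher rule; a fraction `flip` of the labels is inverted (label noise).
        X = rng.standard_normal((n, w.size))
        y = np.sign(X @ w)
        y[rng.random(n) < flip] *= -1
        return X, y

    teacher = rng.standard_normal(10)            # fixed target rule
    X_tr, y_tr = noisy_rule(500, teacher)
    X_te, y_te = noisy_rule(5000, teacher)

    # RBF kernel: implicit feature space of unbounded dimension ("infinite complexity").
    svm = SVC(kernel="rbf", gamma=0.1, C=1.0).fit(X_tr, y_tr)

    print("training error:      ", 1 - svm.score(X_tr, y_tr))
    print("generalization error:", 1 - svm.score(X_te, y_te))

With label noise present, the measured training error typically stays below the test error; the gap between the two is the regime the paper analyzes asymptotically.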
A Variational Approach to Learning Curves
Malzahn, Dörthe, Opper, Manfred
We combine the replica approach from statistical physics with a variational approach to analyze learning curves analytically. We apply the method to Gaussian process regression. As a main result we derive approximative relations between empirical error measures, the generalization error and the posterior variance.
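The relations in the paper are derived analytically; purely as a numerical point of reference (with an arbitrary squared-exponential kernel, noise level, and sample sizes), the sketch below compares the average squared test error of GP regression with the average posterior variance, two of the quantities the abstract relates, for data actually drawn from the assumed GP model, where the two coincide in expectation.

    import numpy as np

    rng = np.random.default_rng(1)

    def se_kernel(a, b, ell=0.3):
        # Squared-exponential covariance k(x, x') = exp(-(x - x')^2 / (2 ell^2)).
        d = a[:, None] - b[None, :]
        return np.exp(-0.5 * (d / ell) ** 2)

    noise, n_train, n_test, n_runs = 0.1, 30, 200, 200
    err, var = 0.0, 0.0

    for _ in range(n_runs):
        x = rng.uniform(-1, 1, n_train)
        xs = rng.uniform(-1, 1, n_test)
        # Draw the target function from the GP prior jointly on training and test inputs.
        xa = np.concatenate([x, xs])
        cov = se_kernel(xa, xa) + 1e-8 * np.eye(xa.size)
        f = rng.multivariate_normal(np.zeros(xa.size), cov)
        y = f[:n_train] + noise * rng.standard_normal(n_train)

        K = se_kernel(x, x) + noise**2 * np.eye(n_train)
        ks = se_kernel(x, xs)
        mean = ks.T @ np.linalg.solve(K, y)
        post_var = 1.0 - np.sum(ks * np.linalg.solve(K, ks), axis=0)

        err += np.mean((mean - f[n_train:]) ** 2) / n_runs
        var += np.mean(post_var) / n_runs

    print("mean squared test error :", err)
    print("mean posterior variance :", var)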
TAP Gibbs Free Energy, Belief Propagation and Sparsity
Csató, Lehel, Opper, Manfred, Winther, Ole
The adaptive TAP Gibbs free energy for a general densely connected probabilistic model with quadratic interactions and arbitrary single site constraints is derived. We show how a specific sequential minimization of the free energy leads to a generalization of Minka's expectation propagation. Lastly, we derive a sparse representation version of the sequential algorithm. The usefulness of the approach is demonstrated on classification and density estimation with Gaussian processes and on an independent component analysis problem.
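For concreteness, the model class meant by "quadratic interactions and arbitrary single site constraints" can be written, in a common notation not taken from the abstract itself, as

    p(\mathbf{x}) \;\propto\; \exp\!\left(\tfrac{1}{2}\,\mathbf{x}^{\top} J\,\mathbf{x}\right) \prod_{i} \rho_{i}(x_{i}),

where J is a dense symmetric coupling matrix and each single-site factor ρ_i can encode, for example, a spin constraint, a classification likelihood, or a source prior in ICA. Roughly speaking, minimizing a Gibbs free energy of this system over the single-site means and variances yields TAP-style approximations to the marginal moments and to −log Z.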
Sparse Representation for Gaussian Process Models
Csató, Lehel, Opper, Manfred
We develop an approach for a sparse representation for Gaussian Process (GP) models in order to overcome the limitations of GPs caused by large data sets. The method is based on a combination of a Bayesian online algorithm together with a sequential construction of a relevant subsample of the data which fully specifies the prediction of the model. Experimental results on toy examples and large real-world data sets indicate the efficiency of the approach.
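Below is a much-simplified sketch of the subsample-selection idea only, under assumed parameter choices; it omits the Bayesian online update of the posterior that the paper combines it with. A new input is kept in the representative set only when it cannot be represented accurately, in the kernel-induced feature space, by the points already selected.

    import numpy as np

    def rbf(a, b, ell=0.5):
        # RBF kernel between two sets of inputs (rows are points).
        d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
        return np.exp(-0.5 * d2 / ell**2)

    def select_basis(X, tol=1e-2):
        # Greedy sparse subsample: keep a point only if its "novelty"
        # (squared residual of its feature-space projection onto the
        # span of the current basis) exceeds tol.
        basis = [0]
        for t in range(1, len(X)):
            Kbb = rbf(X[basis], X[basis]) + 1e-10 * np.eye(len(basis))
            kb = rbf(X[basis], X[t:t + 1])[:, 0]
            gamma = rbf(X[t:t + 1], X[t:t + 1])[0, 0] - kb @ np.linalg.solve(Kbb, kb)
            if gamma > tol:
                basis.append(t)
        return basis

    X = np.random.default_rng(2).uniform(-1, 1, size=(500, 2))
    idx = select_basis(X)
    print(f"kept {len(idx)} of {len(X)} points as the sparse representation")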
Learning Curves for Gaussian Processes Regression: A Framework for Good Approximations
Malzahn, Dörthe, Opper, Manfred
Based on a statistical mechanics approach, we develop a method for approximately computing average case learning curves for Gaussian process regression models. The approximation works well in the large sample size limit and for arbitrary dimensionality of the input space. We explain how the approximation can be systematically improved and argue that similar techniques can be applied to general likelihood models.

1 Introduction
Gaussian process (GP) models have gained considerable interest in the neural computation community (see e.g. [1, 2, 3, 4]) in recent years. Being nonparametric models by construction, their theoretical understanding seems to be less well developed compared to simpler parametric models like neural networks. We are especially interested in developing theoretical approaches which will at least give good approximations to generalization errors when the number of training data is sufficiently large. In this paper we present a step in this direction which is based on a statistical mechanics approach.
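As standard Gaussian process background (not the statistical mechanics method of the paper), GP regression with covariance kernel k and noise variance σ² has posterior mean and variance

    m(x) = \mathbf{k}(x)^{\top} \left(K + \sigma^{2} I\right)^{-1} \mathbf{y}, \qquad
    v(x) = k(x, x) - \mathbf{k}(x)^{\top} \left(K + \sigma^{2} I\right)^{-1} \mathbf{k}(x),

with K_{ij} = k(x_i, x_j) and \mathbf{k}(x)_i = k(x, x_i). The average case learning curve studied here is the generalization error ε(n) = E_{D_n} E_x [ (f(x) − m(x))² ], i.e. the error of the posterior mean averaged over test inputs and over random training sets D_n, as a function of the sample size n.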