AITopics

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.05)
North America > United States > New Jersey > Middlesex County > Piscataway (0.05)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Fass, David, Feldman, Jacob

Categorization Under Complexity: A Unified MDL Account of Human Learning of Regular and Irregular Categories

Neural Information Processing Systems, Dec-31-2003

We present an account of human concept learning (that is, learning of categories from examples) based on the principle of minimum description length (MDL). In support of this theory, we tested a wide range of two-dimensional concept types, including both regular (simple) and highly irregular (complex) structures, and found the MDL theory to give a good account of subjects' performance. This suggests that the intrinsic complexity of a concept (that is, its description length) systematically influences its learnability.

artificial intelligence, bayesian inference, machine learning, (16 more...)

Country: North America > United States (0.28)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Artificial Intelligence, Jun-7-2003

Sequence Prediction based on Monotone Complexity

Hutter, Marcus

This paper studies sequence prediction based on the monotone Kolmogorov complexity Km=-log m, i.e. based on universal deterministic/one-part MDL. m is extremely close to Solomonoff's prior M, the latter being an excellent predictor in deterministic as well as probabilistic environments, where performance is measured in terms of convergence of posteriors or losses. Despite this closeness to M, it is difficult to assess the prediction quality of m, since little is known about the closeness of their posteriors, which are the important quantities for prediction. We show that for deterministic computable environments, the "posterior" and losses of m converge, but rapid convergence could only be shown on-sequence; the off-sequence behavior is unclear. In probabilistic environments, neither the posterior nor the losses converge, in general.
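
For orientation, the quantities named in the abstract can be written out as follows. This is a sketch using the standard textbook definitions, not text quoted from the paper: U is assumed to be a universal monotone Turing machine, \ell(p) the length of program p, and x* any sequence that starts with x. The sum defining M is assumed to range over minimal programs only.

```latex
% Monotone complexity and the deterministic "prior" m derived from it
Km(x) = \min_{p} \{ \ell(p) : U(p) = x* \}, \qquad m(x) = 2^{-Km(x)}

% Solomonoff's prior M sums over all (minimal) programs instead of taking the shortest
M(x) = \sum_{p \,:\, U(p) = x*} 2^{-\ell(p)}

% "Posterior" used for predicting the next symbol x_t given the history x_{<t}
m(x_t \mid x_{<t}) = \frac{m(x_{1:t})}{m(x_{<t})}
```

Since m keeps only the largest term of the sum defining M, m(x) ≤ M(x) holds for every x, which is one way to read the "closeness" the abstract refers to.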

artificial intelligence, machine learning, prediction, (16 more...)

arXiv.org Artificial Intelligence

cs/0306036

Country:

Europe > Switzerland (0.04)
North America > United States > New York (0.04)

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.36)

Herbrich, Ralf, Graepel, Thore

Large Scale Bayes Point Machines

Subsequently, SVMs have been modified to handle regression [12] and GPs have been adapted to the problem of classification [8]. Both schemes essentially work in the same function space that is characterised by kernels (SVM) and covariance functions (GP), respectively. While the formal similarity of the two methods is striking, the underlying paradigms of inference are very different. The SVM was inspired by results from statistical/PAC learning theory while GPs are usually considered in a Bayesian framework. This ideological clash can be viewed as a continuation in machine learning of the by now classical disagreement between Bayesian and frequentistic statistics.

algorithm, classifier, generalisation error, (15 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Ben-David, Shai, Simon, Hans-Ulrich

Efficient Learning of Linear Perceptrons

The resulting combinatorial problem - finding the best agreement half-space over an input sample - is NP hard to approximate to within some constant factor. We suggest a way to circumvent this theoretical bound by introducing a new measure of success for such algorithms. An algorithm is μ-margin successful if the agreement ratio of the half-space it outputs is as good as that of any half-space once training points that are inside the μ-margins of its separating hyper-plane are disregarded. We prove crisp computational complexity results with respect to this success measure: On one hand, for every positive μ, there exist efficient (poly-time) μ-margin successful learning algorithms. On the other hand, we prove that unless P = NP, there is no algorithm that runs in time polynomial in the sample size and in 1/μ that is μ-margin successful for all μ > 0. 1 Introduction We consider the computational complexity of learning linear perceptrons for arbitrary (i.e.
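
The margin-based success measure described in this abstract can be illustrated with a small sketch. The code below is not taken from the paper: the function names, the toy sample, and the choice to divide by the full sample size are assumptions made for illustration, and the margin parameter is written `mu`. It computes the plain agreement ratio of a half-space and the ratio obtained when only points classified correctly at distance at least `mu` from the hyperplane are counted.

```python
import math


def signed_distance(w, b, x):
    """Signed Euclidean distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def agreement_ratio(w, b, points, labels, mu=0.0):
    """Fraction of the sample that the half-space labels correctly
    at distance >= mu from its boundary (mu = 0 gives plain agreement)."""
    hits = 0
    for x, y in zip(points, labels):
        d = y * signed_distance(w, b, x)  # positive iff x is classified correctly
        if d > 0 and d >= mu:
            hits += 1
    return hits / len(points)


# Hypothetical toy sample: labels are +1 / -1, last point is misclassified.
points = [(2.0, 0.0), (0.5, 0.0), (-0.5, 0.0), (-2.0, 0.0), (1.0, 1.0)]
labels = [1, 1, -1, -1, -1]
w, b = (1.0, 0.0), 0.0

plain = agreement_ratio(w, b, points, labels)
with_margin = agreement_ratio(w, b, points, labels, mu=1.0)
print(plain, with_margin)
```

Under this reading, a learner would count as successful for margin `mu` if the plain agreement ratio of its output is at least the `mu`-margin ratio of every competing half-space; the sketch only evaluates one half-space and does not search for the best one.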

algorithm, polynomial time, reduction, (13 more...)

Country:

Europe > Germany (0.04)
Asia > Middle East > Israel > Haifa District > Haifa (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.49)

Bousquet, Olivier, Elisseeff, André

Algorithmic Stability and Generalization Performance

A stable learner is one for which the learned solution does not change much with small changes in the training set. The bounds we obtain do not depend on any measure of the complexity of the hypothesis space (e.g. VC dimension) but rather depend on how the learning algorithm searches this space, and can thus be applied even when the VC dimension is infinite. We demonstrate that regularization networks possess the required stability property and apply our method to obtain new bounds on their generalization performance.
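
The notion of stability in this abstract can be made concrete with a minimal sketch. Everything here is an illustrative assumption rather than the paper's construction: a one-dimensional ridge regressor stands in for a regularization network, the data are invented, and the change is measured on the learned weight, whereas the paper's bounds are stated in terms of the loss. The sketch refits the model with each training point removed in turn and reports the largest shift in the solution.

```python
def fit_ridge(xs, ys, lam):
    """Minimise (1/n) * sum((w*x - y)^2) + lam * w^2 over a single weight w."""
    n = len(xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    # Setting the derivative to zero gives w * (sxx + lam * n) = sxy.
    return sxy / (sxx + lam * n)


def max_leave_one_out_change(xs, ys, lam):
    """Largest absolute change in w when any single training point is removed."""
    w_full = fit_ridge(xs, ys, lam)
    changes = []
    for i in range(len(xs)):
        xs_i = xs[:i] + xs[i + 1:]
        ys_i = ys[:i] + ys[i + 1:]
        changes.append(abs(fit_ridge(xs_i, ys_i, lam) - w_full))
    return max(changes)


# Hypothetical toy data, roughly following y = x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.1, 1.9, 3.2, 3.8]
change = max_leave_one_out_change(xs, ys, lam=0.1)
print(change)
```

A small value of `change` indicates that no single example dominates the fit, which is the informal sense of "does not change much" used above.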

algorithm, artificial intelligence, machine learning, (15 more...)

Country: North America > United States > New York (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Ben-David, Shai, Simon, Hans-Ulrich

Efficient Learning of Linear Perceptrons

The resulting combinatorial problem - finding the best agreement half-space over an input sample - is NP hard to approximate to within some constant factor. We suggest a way to circumvent this theoretical bound by introducing a new measure of success for such algorithms. An algorithm is μ-margin successful if the agreement ratio of the half-space it outputs is as good as that of any half-space once training points that are inside the μ-margins of its separating hyper-plane are disregarded. We prove crisp computational complexity results with respect to this success measure: On one hand, for every positive μ, there exist efficient (poly-time) μ-margin successful learning algorithms. On the other hand, we prove that unless P = NP, there is no algorithm that runs in time polynomial in the sample size and in 1/μ that is μ-margin successful for all μ > 0. 1 Introduction We consider the computational complexity of learning linear perceptrons for arbitrary (i.e. non-separable) data sets.

algorithm, artificial intelligence, machine learning, (15 more...)

Country: Asia > Middle East > Israel (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.49)