
Collaborating Authors

 Williamson, Robert C.


Algorithmic Luckiness

Neural Information Processing Systems

In contrast to standard statistical learning theory, which studies uniform bounds on the expected error, we present a framework that exploits the specific learning algorithm used. Motivated by the luckiness framework [8], we are also able to exploit the serendipity of the training sample. The main difference to previous approaches lies in the complexity measure: rather than covering all hypotheses in a given hypothesis space, it is only necessary to cover the functions which could have been learned using the fixed learning algorithm. We show how the resulting framework relates to the VC, luckiness and compression frameworks. Finally, we present an application of this framework to the maximum margin algorithm for linear classifiers, which results in a bound that exploits both the margin and the distribution of the data in feature space.

1 Introduction

Statistical learning theory is mainly concerned with the study of uniform bounds on the expected error of hypotheses from a given hypothesis space [9, 1].
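For orientation, the kind of uniform, algorithm-independent bound the abstract contrasts itself with is the classical VC bound, sketched below up to constants. This is textbook background, not a result taken from the paper.

```latex
% Classical uniform (algorithm-independent) bound, stated up to constants:
% with probability at least 1 - \delta over an i.i.d. sample of size m,
% simultaneously for every hypothesis h in a class \mathcal{H} of VC dimension d,
\[
  R(h) \;\le\; \hat{R}_m(h)
  \;+\; O\!\left(\sqrt{\frac{d\,\ln(m/d) + \ln(1/\delta)}{m}}\right),
\]
% where R(h) is the expected error and \hat{R}_m(h) the training error.
% The algorithmic-luckiness approach replaces the class-wide capacity d by a
% complexity measure of the functions the fixed learning algorithm can actually output.
```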



Kernel Machines and Boolean Functions

Neural Information Processing Systems

We give results about the learnability and required complexity of logical formulae to solve classification problems. These results are obtained by linking propositional logic with kernel machines. In particular, we show that decision trees and disjunctive normal forms (DNF) can be represented with the help of a special kernel, linking regularized risk to separation margin. Subsequently, we derive a number of lower bounds on the required complexity of logical formulae using properties of algorithms for the generation of linear estimators, such as perceptron and maximal perceptron learning.
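The abstract does not spell out the kernel, so the following is only a hedged sketch using the standard all-subsets (monomial) kernel on 0/1 vectors, k(x, z) = prod_i (1 + x_i z_i) = 2^(number of attributes that are 1 in both). Its feature space is spanned by all conjunctions of positive literals, which is one common way to give Boolean concepts such as DNF terms a separation margin; the paper's exact kernel may differ.

```python
import numpy as np
from sklearn.svm import SVC

def all_subsets_kernel(X, Z):
    """Monomial / all-subsets kernel on 0-1 vectors:
    k(x, z) = prod_i (1 + x_i * z_i) = 2^(#{i : x_i = z_i = 1}).
    Illustrative choice only; not necessarily the kernel used in the paper."""
    X = np.asarray(X, dtype=float)
    Z = np.asarray(Z, dtype=float)
    # K[i, j] = prod_k (1 + X[i, k] * Z[j, k])
    return np.prod(1.0 + X[:, None, :] * Z[None, :, :], axis=2)

if __name__ == "__main__":
    # Toy target: the DNF (x1 AND x2) OR x3 over four Boolean attributes.
    rng = np.random.default_rng(0)
    X = rng.integers(0, 2, size=(200, 4))
    y = np.where((X[:, 0] & X[:, 1]) | X[:, 2], 1, -1)

    # A (near) hard-margin SVM with the precomputed Boolean kernel separates the data,
    # because the target DNF is linear in the space of conjunctions of positive literals.
    clf = SVC(kernel="precomputed", C=1e6).fit(all_subsets_kernel(X, X), y)
    print("training accuracy:", clf.score(all_subsets_kernel(X, X), y))
```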



From Margin to Sparsity

Neural Information Processing Systems

We present an improvement of Novikoff's perceptron convergence theorem. Reinterpreting this mistake bound as a margin-dependent sparsity guarantee allows us to give a PAC-style generalisation error bound for the classifier learned by the perceptron learning algorithm. The bound value crucially depends on the margin a support vector machine would achieve on the same data set using the same kernel. Ironically, the bound yields better guarantees than are currently available for the support vector solution itself.

1 Introduction

In the last few years there has been considerable controversy about the significance of the attained margin, i.e. the smallest real-valued output of a classifier before thresholding, as an indicator of generalisation performance. Results in the VC, PAC and luckiness frameworks seem to indicate that a large margin is a prerequisite for small generalisation error bounds (see [14, 12]).
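For background, the classical (unimproved) form of Novikoff's theorem behind this reinterpretation is sketched below; the paper's sharpened bound is not reproduced here.

```latex
% Classical Novikoff bound (background only; the paper improves on this).
% If every training point satisfies \|x_i\| \le R and some unit-norm weight vector w^*
% separates the data with margin \gamma, i.e. y_i \langle w^*, x_i \rangle \ge \gamma for all i,
% then the perceptron makes at most
\[
  k \;\le\; \left(\frac{R}{\gamma}\right)^{2}
\]
% mistakes. Read as a sparsity statement: the (kernel) perceptron only updates on
% mistakes, so its final hypothesis is a kernel expansion with at most (R/\gamma)^2
% non-zero coefficients -- the margin controls the sparsity.
```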


Support Vector Method for Novelty Detection

Neural Information Processing Systems

Suppose you are given some dataset drawn from an underlying probability distribution P and you want to estimate a "simple" subset S of input space such that the probability that a test point drawn from P lies outside of S equals some a priori specified value.
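As an illustration of this idea, a minimal novelty-detection sketch using scikit-learn's OneClassSVM is given below; in that implementation the nu parameter plays the role of the a priori specified probability (as an upper bound on the fraction of points treated as outliers). This is only a usage example, not code from the paper.

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Minimal sketch of nu-style novelty detection: estimate a "simple" region S of
# input space containing roughly a (1 - nu) fraction of points drawn from P,
# then flag test points outside S as novelties.
rng = np.random.default_rng(0)
X_train = rng.normal(loc=0.0, scale=1.0, size=(500, 2))       # nominal data from P
X_test = np.vstack([rng.normal(0.0, 1.0, size=(20, 2)),       # more nominal points
                    rng.uniform(4.0, 6.0, size=(5, 2))])      # obvious outliers

# nu upper-bounds the fraction of training points treated as outliers
# (and lower-bounds the fraction of support vectors).
detector = OneClassSVM(kernel="rbf", gamma=0.5, nu=0.05).fit(X_train)

pred = detector.predict(X_test)   # +1 = inside S (nominal), -1 = outside S (novel)
print("flagged as novel:", np.where(pred == -1)[0])
```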