AITopics

We develop an intuitive geometric framework for support vector regression (SVR). By examining when ɛ-tubes exist, we show that SVR can be regarded as a classification problem in the dual space. Hard and soft ɛ-tubes are constructed by separating the convex or reduced convex hulls respectively of the training data with the response variable shifted up and down by ɛ. A novel SVR model is proposed based on choosing the max-margin plane between the two shifted datasets.

convex hull, plane, rc-svr, (15 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > New York > Rensselaer County > Troy (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Belkin, Mikhail, Niyogi, Partha

Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering

Drawing on the correspondence between the graph Laplacian, the Laplace-Beltrami operator on a manifold, and the connections to the heat equation, we propose a geometrically motivated algorithm for constructing a representation for data sampled from a low dimensional manifold embedded in a higher dimensional space. The algorithm provides a computationally efficient approach to nonlinear dimensionality reduction that has locality preserving properties and a natural connection to clustering.

graph, laplacian, manifold, (12 more...)

Country:

North America > United States > Illinois > Cook County > Chicago (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Data Science > Data Mining (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.50)

Bach, Francis R., Jordan, Michael I.

Thin Junction Trees

We present an algorithm that induces a class of models with thin junction trees--models that are characterized by an upper bound on the size of the maximal cliques of their triangulated graph. By ensuring that the junction tree is thin, inference in our models remains tractable throughout the learning process. This allows both an efficient implementation of an iterative scaling parameter estimation algorithm and also ensures that inference can be performed efficiently with the final model. We illustrate the approach with applications in handwritten digit recognition and DNA splice site detection.

algorithm, junction tree, thin junction tree, (14 more...)

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
Asia > Middle East > Jordan (0.05)
North America > United States > New York (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.95)

Andrieu, Christophe, Freitas, Nando D., Doucet, Arnaud

Rao-Blackwellised Particle Filtering via Data Augmentation

SMC is often referred to as particle filtering (PF) in the context of computing filtering distributions for statistical inference and learning. It is known that the performance of PF often deteriorates in high-dimensional state spaces. In the past, we have shown that if a model admits partial analytical tractability, it is possible to combine PF with exact algorithms (Kalman filters, HMM filters, junction tree algorithm) to obtain efficient high dimensional filters (Doucet, de Freitas, Murphy and Russell 2000, Doucet, Godsill and Andrieu 2000). In particular, we exploited a marginalisation technique known as Rao-Blackwellisation (RB). Here, we attack a more complex model that does not admit immediate analytical tractability.

algorithm, doucet, particle, (13 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > New York (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)

d', Alché-Buc, Florence, Grandvalet, Yves, Ambroise, Christophe

Semi-supervised MarginBoost

In many discrimination problems a large amount of data is available but only a few of them are labeled. This provides a strong motivation to improve or develop methods for semi-supervised learning. In this paper, boosting is generalized to this task within the optimization framework of MarginBoost. We extend the margin definition to unlabeled data and develop the gradient descent algorithm that corresponds to the resulting margin cost function. This meta-learning scheme can be applied to any base classifier able to benefit from unlabeled data. We propose here to apply it to mixture models trained with an Expectation-Maximization algorithm. Promising results are presented on benchmarks with different rates of labeled data.

algorithm, base classifier, unlabeled data, (14 more...)

Country:

Europe > France > Hauts-de-France > Oise > Compiègne (0.05)
North America > United States > California (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.80)

Generalization Performance of Some Learning Problems in Hilbert Functional Spaces

Zhang, T.

We investigate the generalization performance of some learning problems in Hilbert functional Spaces. We introduce a notion of convergence of the estimated functional predictor to the best underlying predictor, and obtain an estimate on the rate of the convergence. This estimate allows us to derive generalization bounds on some learning formulations.

inequality, probability, theorem 3, (14 more...)

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry: Education > Focused Education > Special Education (0.61)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

Opper, Manfred, Urbanczik, Robert

Asymptotic Universality for Learning Curves of Support Vector Machines

Using methods of Statistical Physics, we investigate the rOle of model complexity in learning with support vector machines (SVMs). We show the advantages of using SVMs with kernels of infinite complexity on noisy target rules, which, in contrast to common theoretical beliefs, are found to achieve optimal generalization error although the training error does not converge to the generalization error. Moreover, we find a universal asymptotics of the learning curves which only depend on the target rule but not on the SVM kernel. 1 Introduction Powerful systems for data inference, like neural networks implement complex inputoutput relations by learning from example data. The price one has to pay for the flexibility of these models is the need to choose the proper model complexity for a given task, i.e. the system architecture which gives good generalization ability for novel data. This has become an important problem also for support vector machines [1].

generalization error, kernel, target rule, (15 more...)

Country:

Europe > Germany > Bavaria > Lower Franconia > Würzburg (0.05)
Europe > United Kingdom (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Malzahn, Dörthe, Opper, Manfred

A Variational Approach to Learning Curves

We combine the replica approach from statistical physics with a variational approach to analyze learning curves analytically. We apply the method to Gaussian process regression. As a main result we derive approximative relations between empirical error measures, the generalization error and the posterior variance.

free energy, kernel, posterior variance, (14 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Singapore (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.96)

Kowalczyk, Adam, Smola, Alex J., Williamson, Robert C.

Kernel Machines and Boolean Functions

We give results about the learnability and required complexity of logical formulae to solve classification problems. These results are obtained by linking propositional logic with kernel machines. In particular we show that decision trees and disjunctive normal forms (DNF) can be represented by the help of a special kernel, linking regularized risk to separation margin. Subsequently we derive a number of lower bounds on the required complexity of logic formulae using properties of algorithms for generation of linear estimators, such as perceptron and maximal perceptron learning.

algorithm, decision tree, perceptron, (14 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.73)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.61)

Khardon, Roni, Roth, Dan, Servedio, Rocco A.

Efficiency versus Convergence of Boolean Kernels for On-Line Learning Algorithms

We study online learning in Boolean domains using kernels which capture feature expansions equivalent to using conjunctions over basic features. We demonstrate a tradeoff between the computational efficiency with which these kernels can be computed and the generalization ability of the resulting classifier. We first describe several kernel functions which capture either limited forms of conjunctions or all conjunctions. We show that these kernels can be used to efficiently run the Perceptron algorithm over an exponential number of conjunctions; however we also prove that using such kernels the Perceptron algorithm can make an exponential number of mistakes even when learning simple functions. We also consider an analogous use of kernel functions to run the multiplicative-update Winnow algorithm over an expanded feature space of exponentially many conjunctions. While known upper bounds imply that Winnow can learn DNF formulae with a polynomial mistake bound in this setting, we prove that it is computationally hard to simulate Winnow's behavior for learning DNF over such a feature set, and thus that such kernel functions for Winnow are not efficiently computable.

algorithm, conjunction, perceptron algorithm, (15 more...)