AITopics

This article presents a Support Vector Machine (SVM) like learning system tohandle multi-label problems. Such problems are usually decomposed intomany two-class problems but the expressive power of such a system can be weak [5, 7]. We explore a new direct approach. It is based on a large margin ranking system that shares a lot of common properties withSVMs. We tested it on a Yeast gene functional classification problem with positive results.

binary approach, boostexter, ranking system, (14 more...)

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.58)

Domingos, Pedro, Hulten, Geoff

Learning from Infinite Data in Finite Time

We propose the following general method for scaling learning algorithms to arbitrarily large data sets. We apply this method to the EM algorithm for mixtures of Gaussians. Preliminary experiments on a series of large data sets provide evidence of the potential of this approach. On the other hand, they require large computational resources to learn from. While in the past the factor limiting the quality of learnable models was typically the quantity of data available, in many domains today data is superabundant, and the bottleneck is t he time required to process it.

algorithm, iteration, probability, (16 more...)

Country:

North America > United States > Washington > King County > Seattle (0.14)
North America > United States > Washington > King County > Redmond (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Domeniconi, Carlotta, Gunopulos, Dimitrios

Adaptive Nearest Neighbor Classification Using Support Vector Machines

The nearest neighbor technique is a simple and appealing method to address classification problems. It relies on the assumption of locally constant class conditional probabilities. This assumption becomes invalid in high dimensions with a finite number of examples dueto the curse of dimensionality. We propose a technique that computes a locally flexible metric by means of Support Vector Machines (SVMs). The maximum margin boundary found by the SVM is used to determine the most discriminant direction over the query's neighborhood. Such direction provides a local weighting scheme for input features.

boundary, classification, support vector, (13 more...)

Country: North America > United States > California > Riverside County > Riverside (0.14)

Genre: Research Report (0.47)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Collins, Michael, Duffy, Nigel

Convolution Kernels for Natural Language

We describe the application of kernel methods to Natural Language Processing (NLP)problems. In many NLP tasks the objects being modeled are strings, trees, graphs or other discrete structures which require some mechanism to convert them into feature vectors. We describe kernels for various natural language structures, allowing rich, high dimensional representations ofthese structures. We show how a kernel over trees can be applied to parsing using the voted perceptron algorithm, and we give experimental results on the ATIS corpus of parse trees.

algorithm, convolution kernel, kernel, (12 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.05)
North America > United States > New Jersey (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Collins, Michael, Dasgupta, S., Schapire, Robert E.

A Generalization of Principal Components Analysis to the Exponential Family

Principal component analysis (PCA) is a commonly applied technique for dimensionality reduction. PCA implicitly minimizes a squared loss function, which may be inappropriate for data that is not real-valued, such as binary-valued data. This paper draws on ideas from the Exponential family,Generalized linear models, and Bregman distances, to give a generalization of PCA to loss functions that we argue are better suited to other data types. We describe algorithms for minimizing the loss functions, andgive examples on simulated data.

bregman distance, exponential family, pca, (16 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.61)

Fast Parameter Estimation Using Green's Functions

Wong, K., Li, F.

Simulationresults show that it is efficient and precise, when compared with cross-validation and other techniques which require matrix inversion.

cavity method, generalization error, green, (13 more...)

Country:

Asia > China > Hong Kong (0.05)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.96)

El-Yaniv, Ran, Souroujon, Oren

Iterative Double Clustering for Unsupervised and Semi-Supervised Learning

We present a powerful meta-clustering technique called Iterative Double Clustering(IDC). The IDC method is a natural extension of the recent Double Clustering (DC) method of Slonim and Tishby that exhibited impressiveperformance on text categorization tasks [12]. Using synthetically generated data we empirically find that whenever the DC procedure is successful in recovering some of the structure hidden in the data, the extended IDC procedure can incrementally compute a significantly more accurate classification. IDC is especially advantageous whenthe data exhibits high attribute noise. Our simulation results also show the effectiveness of IDC in text categorization problems. Surprisingly,this unsupervised procedure can be competitive with a (supervised) SVM trained with a small training set. Finally, we propose a simple and natural extension of IDC for semi-supervised and transductive learning where we are given both labeled and unlabeled examples.

artificial intelligence, idc, machine learning, (16 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.90)

Shimodaira, Hiroshi, Noma, Ken-ichi, Nakai, Mitsuru, Sagayama, Shigeki

Dynamic Time-Alignment Kernel in Support Vector Machine

A new class of Support Vector Machine (SVM) that is applicable to sequential-pattern recognition such as speech recognition is developed by incorporating an idea of nonlinear time alignment into the kernel function. Since the time-alignment operation of sequential pattern is embedded in the new kernel function, standard SVM training and classification algorithms can be employed without further modifications. The proposed SVM (DTAK-SVM) is evaluated in speaker-dependent speech recognition experiments of hand-segmented phoneme recognition. Preliminary experimental results show comparable recognition performance with hidden Markov models (HMMs).

artificial intelligence, machine learning, recognition, (17 more...)

Country: Asia > Japan (0.15)

Genre: Research Report (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.90)

Collobert, Ronan, Bengio, Samy, Bengio, Yoshua

A Parallel Mixture of SVMs for Very Large Scale Problems

Support Vector Machines (SVMs) are currently the state-of-the-art models for many classification problems but they suffer from the complexity of their training algorithmwhich is at least quadratic with respect to the number of examples. Hence, it is hopeless to try to solve real-life problems having more than a few hundreds of thousands examples with SVMs. The present paper proposes a new mixture of SVMs that can be easily implemented in parallel and where each SVM is trained on a small subset of the whole dataset. Experiments on a large benchmark dataset (Forest) as well as a difficult speech database, yielded significant time improvement (time complexity appears empirically to locally grow linearly with the number of examples) . In addition, and that is a surprise, a significant improvement in generalization was observed on Forest. 1 Introduction Recently a lot of work has been done around Support Vector Machines [9], mainly due to their impressive generalization performances on classification problems when compared to other algorithms such as artificial neural networks [3, 6].

artificial intelligence, machine learning, svm, (19 more...)

Country:

Oceania > Australia (0.28)
North America > Canada > Quebec (0.15)

Genre: Research Report (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Bi, J., Bennett, Kristin P.

Duality, Geometry, and Support Vector Regression

We develop an intuitive geometric framework for support vector regression (SVR). By examining when ɛ-tubes exist, we show that SVR can be regarded as a classification problem in the dual space. Hard and soft ɛ-tubes are constructed by separating the convex or reduced convex hulls respectively of the training data with the response variable shifted up and down by ɛ. A novel SVR model is proposed based on choosing the max-margin plane between the two shifted datasets.

artificial intelligence, convex hull, machine learning, (16 more...)

Country: North America > United States (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)