AITopics

0708.4311

Country:

North America > United States (1.00)
Europe (1.00)
Asia (0.68)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games > Chess (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

arXiv.org Machine LearningAug-16-2007

Piecewise linear regularized solution paths

Rosset, Saharon, Zhu, Ji

We consider the generic regularized optimization problem $\hat{\mathsf{\beta}}(\lambda)=\arg \min_{\beta}L({\sf{y}},X{\sf{\beta}})+\lambda J({\sf{\beta}})$. Efron, Hastie, Johnstone and Tibshirani [Ann. Statist. 32 (2004) 407--499] have shown that for the LASSO--that is, if $L$ is squared error loss and $J(\beta)=\|\beta\|_1$ is the $\ell_1$ norm of $\beta$--the optimal coefficient path is piecewise linear, that is, $\partial \hat{\beta}(\lambda)/\partial \lambda$ is piecewise constant. We derive a general characterization of the properties of (loss $L$, penalty $J$) pairs which give piecewise linear coefficient paths. Such pairs allow for efficient generation of the full regularized coefficient paths. We investigate the nature of efficient path following algorithms which arise. We use our results to suggest robust versions of the LASSO for regression and classification, and to develop new, efficient algorithms for existing problems in the literature, including Mammen and van de Geer's locally adaptive regression splines.

algorithm, artificial intelligence, machine learning, (19 more...)

doi: 10.1214/009053606000001370

0708.2197

Country: North America > United States (0.68)

Genre:

Research Report > New Finding (0.48)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.88)

Steinwart, Ingo, Scovel, Clint

Fast rates for support vector machines using Gaussian kernels

arXiv.org Machine LearningAug-14-2007

For binary classification we establish learning rates up to the order of $n^{-1}$ for support vector machines (SVMs) with hinge loss and Gaussian RBF kernels. These rates are in terms of two assumptions on the considered distributions: Tsybakov's noise assumption to establish a small estimation error, and a new geometric noise condition which is used to bound the approximation error. Unlike previously proposed concepts for bounding the approximation error, the geometric noise assumption does not employ any smoothness assumption.

artificial intelligence, machine learning, theorem 2, (16 more...)

doi: 10.1214/009053606000001226

0708.1838

Country: North America > United States (0.68)

Genre: Research Report (0.63)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Chen, Aiyou, Bickel, Peter J.

Efficient independent component analysis

arXiv.org Machine LearningAug-2-2007

Independent component analysis (ICA) has been widely used for blind source separation in many fields such as brain imaging analysis, signal processing and telecommunication. Many statistical techniques based on M-estimates have been proposed for estimating the mixing matrix. Recently, several nonparametric methods have been developed, but in-depth analysis of asymptotic efficiency has not been available. We analyze ICA using semiparametric theories and propose a straightforward estimate based on the efficient score function by using B-spline approximations. The estimate is asymptotically efficient under moderate conditions and exhibits better performance than standard ICA methods in a variety of simulations.

artificial intelligence, component analysis, machine learning, (18 more...)

doi: 10.1214/009053606000000939

0705.4230

Country: North America > United States > California (0.46)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

arXiv.org Artificial IntelligenceJul-29-2007

A Leaf Recognition Algorithm for Plant Classification Using Probabilistic Neural Network

Wu, Stephen Gang, Bao, Forrest Sheng, Xu, Eric You, Wang, Yu-Xuan, Chang, Yi-Fan, Xiang, Qiao-Liang

In this paper, we employ Probabilistic Neural Network (PNN) with image and data processing techniques to implement a general purpose automated leaf recognition algorithm. 12 leaf features are extracted and orthogonalized into 5 principal variables which consist the input vector of the PNN. The PNN is trained by 1800 leaves to classify 32 kinds of plants with an accuracy greater than 90%. Compared with other approaches, our algorithm is an accurate artificial intelligence approach which is fast in execution and easy in implementation.

artificial intelligence, machine learning, vector, (18 more...)

0707.4289

Country:

Asia > China (0.48)
North America > United States > Pennsylvania (0.14)

Genre: Research Report (0.40)

Industry: Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Artificial IntelligenceJul-26-2007

Learning Probabilistic Models of Word Sense Disambiguation

Pedersen, Ted

This dissertation presents several new methods of supervised and unsupervised learning of word sense disambiguation models. The supervised methods focus on performing model searches through a space of probabilistic models, and the unsupervised methods rely on the use of Gibbs Sampling and the Expectation Maximization (EM) algorithm. In both the supervised and unsupervised case, the Naive Bayesian model is found to perform well. An explanation for this success is presented in terms of learning rates and bias-variance decompositions.

accuracy, machine learning, natural language, (20 more...)

0707.3972

Country:

North America > United States > California (0.45)
Europe > United Kingdom > England (0.27)

Genre:

Summary/Review (1.00)
Research Report > Experimental Study (0.92)

Industry:

Banking & Finance (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.92)
Media (0.67)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Banerjee, Onureena, Ghaoui, Laurent El, d'Aspremont, Alexandre

Model Selection Through Sparse Maximum Likelihood Estimation

arXiv.org Artificial IntelligenceJul-4-2007

We consider the problem of estimating the parameters of a Gaussian or binary distribution in such a way that the resulting undirected graphical model is sparse. Our approach is to solve a maximum likelihood problem with an added l_1-norm penalty term. The problem as formulated is convex but the memory requirements and complexity of existing interior point methods are prohibitive for problems with more than tens of nodes. We present two new algorithms for solving problems with at least a thousand nodes in the Gaussian case. Our first algorithm uses block coordinate descent, and can be interpreted as recursive l_1-norm penalized regression. Our second algorithm, based on Nesterov's first order method, yields a complexity estimate with a better dependence on problem size than existing interior point methods. Using a log determinant relaxation of the log partition function (Wainwright & Jordan (2006)), we show that these same algorithms can be used to solve an approximate sparse maximum likelihood problem for the binary case. We test our algorithms on synthetic data, as well as on gene expression and senate voting records data.

artificial intelligence, bayesian inference, machine learning, (15 more...)

0707.0704

Country:

North America > United States > California (0.28)
Asia > Middle East > Jordan (0.25)

Genre: Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Steinwart, Ingo, Hush, Don, Scovel, Clint

Learning from dependent observations

arXiv.org Machine LearningJul-2-2007

In most papers establishing consistency for learning algorithms it is assumed that the observations used for training are realizations of an i.i.d. process. In this paper we go far beyond this classical framework by showing that support vector machines (SVMs) essentially only require that the data-generating process satisfies a certain law of large numbers. We then consider the learnability of SVMs for $\a$-mixing (not necessarily stationary) processes for both classification and regression, where for the latter we explicitly allow unbounded noise.

artificial intelligence, machine learning, stochastic process, (18 more...)

0707.0303

Country: North America > United States (1.00)

Genre: Research Report (0.81)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.55)

Sriperumbudur, Bharath K., Lanckriet, Gert R. G.

Metric Embedding for Nearest Neighbor Classification

arXiv.org Machine LearningJun-24-2007

The distance metric plays an important role in nearest neighbor (NN) classification. Usually the Euclidean distance metric is assumed or a Mahalanobis distance metric is optimized to improve the NN performance. In this paper, we study the problem of embedding arbitrary metric spaces into a Euclidean space with the goal to improve the accuracy of the NN classifier. We propose a solution by appealing to the framework of regularization in a reproducing kernel Hilbert space and prove a representer-like theorem for NN classification. The embedding function is then determined by solving a semidefinite program which has an interesting connection to the soft-margin linear binary support vector machine classifier. Although the main focus of this paper is to present a general, theoretical framework for metric embedding in a NN setting, we demonstrate the performance of the proposed method on some benchmark datasets and show that it performs better than the Mahalanobis metric learning algorithm in terms of leave-one-out and generalization errors.

artificial intelligence, distance metric, machine learning, (18 more...)

0706.3499

Country: North America > United States > California (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.69)

Shafer, Glenn, Vovk, Vladimir

A tutorial on conformal prediction

arXiv.org Machine LearningJun-21-2007

Conformal prediction uses past experience to determine precise levels of confidence in new predictions. Given an error probability $\epsilon$, together with a method that makes a prediction $\hat{y}$ of a label $y$, it produces a set of labels, typically containing $\hat{y}$, that also contains $y$ with probability $1-\epsilon$. Conformal prediction can be applied to any method for producing $\hat{y}$: a nearest-neighbor method, a support-vector machine, ridge regression, etc. Conformal prediction is designed for an on-line setting in which labels are predicted successively, each one being revealed before the next is predicted. The most novel and valuable feature of conformal prediction is that if the successive examples are sampled independently from the same distribution, then the successive predictions will be right $1-\epsilon$ of the time, even though they are based on an accumulating dataset rather than on independent datasets. In addition to the model under which successive examples are sampled independently, other on-line compression models can also use conformal prediction. The widely used Gaussian linear model is one of these. This tutorial presents a self-contained account of the theory of conformal prediction and works through several numerical examples. A more comprehensive treatment of the topic is provided in "Algorithmic Learning in a Random World", by Vladimir Vovk, Alex Gammerman, and Glenn Shafer (Springer, 2005).

artificial intelligence, machine learning, prediction, (17 more...)

0706.3188

Country: North America > United States (1.00)

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Education (0.50)
Leisure & Entertainment (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
(2 more...)