AITopics

Country:

North America > United States > Wisconsin > Dane County > Madison (0.29)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > New York (0.05)
(3 more...)

Genre: Research Report (0.48)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Xing, Eric P., Jordan, Michael I., Russell, Stuart J., Ng, Andrew Y.

Distance Metric Learning with Application to Clustering with Side-Information

Many algorithms rely critically on being given a good metric over their inputs. For instance, data can often be clustered in many "plausible" ways, and if a clustering algorithm such as K-means initially fails to find one that is meaningful to a user, the only recourse may be for the user to manually tweak the metric until sufficiently good clusters are found. For these and other applications requiring good metrics, it is desirable that we provide a more systematic way for users to indicate what they consider "similar." For instance, we may ask them to provide examples.

algorithm, distance metric, k-means, (14 more...)

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > New York (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.67)

Mean Field Approach to a Probabilistic Model in Information Retrieval

Wu, Bin, Wong, K., Bodoff, David

We study an explicit parametric model of documents, queries, and relevancy assessment for Information Retrieval (IR). Mean-field methods are applied to analyze the model and derive efficient practical algorithms to estimate the parameters in the problem. The hyperparameters are estimated by a fast approximate leave-one-out cross-validation procedure based on the cavity method. The algorithm is further evaluated on several benchmark databases by comparing with standard algorithms in IR.

document and query, estimation, hyperparameter, (12 more...)

Country:

Asia > China > Hong Kong (0.06)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
North America > United States > New York (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.73)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.52)

Ghahramani, Zoubin, Rasmussen, Carl E.

Bayesian Monte Carlo

We investigate Bayesian alternatives to classical Monte Carlo methods for evaluating integrals. Bayesian Monte Carlo (BMC) allows the incorporation of prior knowledge, such as smoothness of the integrand, into the estimation. In a simple problem we show that this outperforms any classical importance sampling method. We also attempt more challenging multidimensional integrals involved in computing marginal likelihoods of statistical models (a.k.a.

carlo, likelihood, monte carlo, (12 more...)

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.05)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)

Ong, Cheng S., Williamson, Robert C., Smola, Alex J.

Hyperkernels

We consider the problem of choosing a kernel suitable for estimation using a Gaussian Process estimator or a Support Vector Machine. A novel solution is presented which involves defining a Reproducing Kernel Hilbert Space on the space of kernels itself. By utilizing an analog of the classical representer theorem, the problem of choosing a kernel from a parameterized family of kernels (e.g. of varying width) is reduced to a statistical estimation problem akin to the problem of minimizing a regularized risk functional. Various classical settings for model or kernel selection are special cases of our framework.

hyperkernel, kernel, reproducing kernel hilbert space, (13 more...)

Country:

Asia > Middle East > Jordan (0.04)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)
North America > United States > Wisconsin (0.04)
North America > United States > New York (0.04)

Genre: Research Report (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.56)

Cesa-bianchi, Nicolò, Conconi, Alex, Gentile, Claudio

Margin-Based Algorithms for Information Filtering

In this work, we study an information filtering model where the relevance labels associated to a sequence of feature vectors are realizations of an unknown probabilistic linear function. Building on the analysis of a restricted version of our model, we derive a general filtering rule based on the margin of a ridge regression estimator. While our rule may observe the label of a vector only by classfying the vector as relevant, experiments on a real-world document filtering problem show that the performance of our rule is close to that of the online classifier which is allowed to observe all labels. These empirical results are complemented by a theoretical analysis where we consider a randomized variant of our rule and prove that its expected number of mistakes is never much larger than that of the optimal filtering rule which knows the hidden linear model.

algorithm, fil, probability, (14 more...)

Country:

Europe > Italy (0.05)
North America > United States > New York (0.04)

Industry: Education > Educational Setting > Online (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.51)

Crammer, Koby, Gilad-bachrach, Ran, Navot, Amir, Tishby, Naftali

Margin Analysis of the LVQ Algorithm

Prototypes based algorithms are commonly used to reduce the computational complexity of Nearest-Neighbour (NN) classifiers. In this paper we discuss theoretical and algorithmical aspects of such algorithms. On the theory side, we present margin based generalization bounds that suggest that these kinds of classifiers can be more accurate then the 1-NN rule. Furthermore, we derived a training algorithm that selects a good set of prototypes using large margin principles. We also show that the 20 years old Learning Vector Quantization (LVQ) algorithm emerges naturally from our framework.

algorithm, loss function, prototype, (15 more...)

Country:

North America > United States > New York (0.04)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)

Industry: Education (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Effective Dimension and Generalization of Kernel Learning

Zhang, Tong

We investigate the generalization performance of some learning problems in Hilbert function Spaces. We introduce a concept of scalesensitive effective data dimension, and show that it characterizes the convergence rate of the underlying learning problem. Using this concept, we can naturally extend results for parametric estimation problems in finite dimensional spaces to nonparametric kernel learning methods. We derive upper bounds on the generalization performance and show that the resulting convergent rates are optimal under various circumstances.

dimension, hilbert space, inequality, (15 more...)

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry: Education > Focused Education > Special Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Langford, John, Shawe-Taylor, John

PAC-Bayes & Margins

There are two mathematical flavors of margin bound dependent upon the weights Wi of the vote and the features Xi that the vote is taken over. The results here are of the "bll2" form. We improve on Shawe-Taylor et al. [12] and Bartlett [1] by a log(m)2 sample complexity factor and much tighter constants (1000 or unstated versus 9 or 18 as suggested by Section 2.2). In addition, the bound here covers margin errors without weakening the error-free case. Herbrich and Graepel [3] moved significantly towards the approach adopted in our paper, but the methodology adopted meant that their result does not scale well to high dimensional feature spaces as the bound here (and earlier results) do.

classifier, error rate, stochastic classifier, (15 more...)

Country: North America > United States > New York (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)