AITopics

We propose an information-theoretic clustering approach that incorporates a pre-known partition of the data, aiming to identify common clusters that cut across the given partition. In the standard clustering setting the formation of clusters is guided by a single source of feature information. The newly utilized pre-partition factor introduces an additional bias that counterbalances the impact of the features whenever they become correlated with this known partition. The resulting algorithmic framework was applied successfully to synthetic data, as well as to identifying text-based cross-religion correspondences.

algorithm, artificial intelligence, machine learning, (19 more...)

Country:

Asia > Middle East > Israel (0.15)
North America > United States > New York (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.48)

Roth, Volker, Lange, Tilman

Feature Selection in Clustering Problems

A novel approach to combining clustering and feature selection is presented. It implements a wrapper strategy for feature selection, in the sense that the features are directly selected by optimizing the discriminative power of the used partitioning algorithm. On the technical side, we present an efficient optimization algorithm with guaranteed local convergence property. The only free parameter of this method is selected by a resampling-based stability analysis. Experiments with real-world datasets demonstrate that our method is able to infer both meaningful partitions and meaningful subsets of features.

artificial intelligence, machine learning, partition, (17 more...)

Country: North America > United States (0.14)

Genre:

Research Report (0.48)
Overview (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Weston, Jason, Schölkopf, Bernhard, Bakir, Gökhan H.

Learning to Find Pre-Images

We consider the problem of reconstructing patterns from a feature map. Learning algorithms using kernels to operate in a reproducing kernel Hilbert space (RKHS) express their solutions in terms of input points mapped into the RKHS. We introduce a technique based on kernel principal component analysis and regression to reconstruct corresponding patterns in the input space (aka pre-images) and review its performance in several applications requiring the construction of pre-images. The introduced technique avoids difficult and/or unstable numerical optimization, is easy to implement and, unlike previous methods, permits the computation of pre-images in discrete input spaces.

artificial intelligence, kernel, machine learning, (15 more...)

Country:

North America > United States (0.47)
Europe (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)

Gaussian Process Latent Variable Models for Visualisation of High Dimensional Data

Lawrence, Neil D.

In this paper we introduce a new underlying probabilistic model for principal component analysis (PCA). Our formulation interprets PCA as a particular Gaussian process prior on a mapping from a latent space to the observed data-space. We show that if the prior's covariance function constrains the mappings to be linear the model is equivalent to PCA; we then extend the model by considering less restrictive covariance functions which allow nonlinear mappings. This more general Gaussian process latent variable model (GPLVM) is then evaluated as an approach to the visualisation of high dimensional data for three different data-sets. Additionally, our nonlinear algorithm can be further kernelised, leading to 'twin kernel PCA' in which a mapping between feature spaces occurs.

artificial intelligence, latent space, machine learning, (12 more...)

Country:

North America > United States (0.15)
Europe > Switzerland (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.72)
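
The PCA equivalence described in the abstract above can be written compactly. The following is a sketch under assumed notation that does not appear in this listing: Y is a centred N x D data matrix with columns y_{:,d}, X is an N x q matrix of latent points, and beta is a noise precision. With a linear covariance function, the marginal likelihood of the data given the latent points takes the form below, and maximising it with respect to X recovers a PCA-like solution. Swapping the linear term X X^T for a nonlinear covariance function is what would yield the more general model.

```latex
p(Y \mid X, \beta) = \prod_{d=1}^{D} \mathcal{N}\!\left( y_{:,d} \,\middle|\, 0,\; K \right),
\qquad
K = X X^{\top} + \beta^{-1} I .
```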

Verbeek, Jakob J., Roweis, Sam T., Vlassis, Nikos

Non-linear CCA and PCA by Alignment of Local Models

We propose a nonlinear Canonical Correlation Analysis (CCA) method which works by coordinating or aligning mixtures of linear models. In the same way that CCA extends the idea of PCA, our work extends recent methods for nonlinear dimensionality reduction to the case where multiple embeddings of the same underlying low dimensional coordinates are observed, each lying on a different high dimensional manifold. We also show that a special case of our method, when applied to only a single manifold, reduces to the Laplacian Eigenmaps algorithm. As with previous alignment schemes, once the mixture models have been estimated, all of the parameters of our model can be estimated in closed form without local optima in the learning. Experimental results illustrate the viability of the approach as a nonlinear extension of CCA.

artificial intelligence, machine learning, optimization problem, (18 more...)

Country: North America > Canada (0.28)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.34)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)

Hamerly, Greg, Elkan, Charles

Learning the k in k-means

When clustering a dataset, the right number k of clusters to use is often not obvious, and choosing k automatically is a hard algorithmic problem. In this paper we present an improved algorithm for learning k while clustering. The G-means algorithm is based on a statistical test for the hypothesis that a subset of data follows a Gaussian distribution. G-means runs k-means with increasing k in a hierarchical fashion until the test accepts the hypothesis that the data assigned to each k-means center are Gaussian. Two key advantages are that the hypothesis test does not limit the covariance of the data and does not compute a full covariance matrix. Additionally, G-means only requires one intuitive parameter, the standard statistical significance level α. We present results from experiments showing that the algorithm works well, and better than a recent method based on the BIC penalty for model complexity. In these experiments, we show that the BIC is ineffective as a scoring function, since it does not penalize strongly enough the model's complexity.

algorithm, artificial intelligence, machine learning, (17 more...)

Country: North America > United States > California > San Francisco County > San Francisco (0.15)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.49)
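
The Gaussianity check that drives the splitting decision in the G-means abstract above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the abstract does not name the test, so an Anderson-Darling normality test is assumed here; the input is assumed to be already projected to one dimension; and the names `anderson_darling_normal`, `should_split`, and the default threshold `critical=1.0` are hypothetical. In practice the threshold would be looked up from a critical-value table for the chosen significance level α.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling_normal(values):
    """Corrected Anderson-Darling statistic for a 1-D sample, testing
    normality with mean and variance estimated from the sample itself."""
    n = len(values)
    mu, sigma = mean(values), stdev(values)
    std_normal = NormalDist()
    # Standardize, sort, and map each point through the standard normal CDF.
    cdf = [std_normal.cdf((v - mu) / sigma) for v in sorted(values)]
    # Clamp probabilities so the logarithms below stay finite.
    eps = 1e-12
    cdf = [min(max(p, eps), 1.0 - eps) for p in cdf]
    total = sum(
        (2 * i + 1) * (math.log(cdf[i]) + math.log(1.0 - cdf[n - 1 - i]))
        for i in range(n)
    )
    a2 = -n - total / n
    # Small-sample correction factor (an assumption of this sketch; other
    # tabulations pair their critical values with a different factor).
    return a2 * (1.0 + 4.0 / n - 25.0 / (n * n))


def should_split(projected, critical=1.0):
    """Return True when the sample looks non-Gaussian, i.e. when the cluster
    it came from would be a candidate for splitting into two centers.
    `critical` is an illustrative placeholder, not derived from any alpha."""
    return anderson_darling_normal(projected) > critical


# Deterministic toy inputs: evenly spaced normal quantiles versus two clumps.
gaussian_like = [NormalDist().inv_cdf((i + 0.5) / 100) for i in range(100)]
two_clumps = [-5.0 + 0.01 * i for i in range(50)] + [5.0 + 0.01 * i for i in range(50)]
```

A full clustering loop would call `should_split` on the points assigned to each center after every k-means run, add a center wherever it returns True, and stop once every cluster passes the test.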

Paciorek, Christopher J., Schervish, Mark J.

Nonstationary Covariance Functions for Gaussian Process Regression

While the mixture approach is intriguing, neither of [8, 9] compares its model to other methods.

artificial intelligence, bayesian inference, machine learning, (17 more...)

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.28)
Europe > United Kingdom > England (0.28)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Kemp, Charles, Griffiths, Thomas L., Stromsten, Sean, Tenenbaum, Joshua B.

Semi-Supervised Learning with Trees

We describe a nonparametric Bayesian approach to generalizing from few labeled examples, guided by a larger set of unlabeled objects and the assumption of a latent tree-structure to the domain. The tree (or a distribution over trees) may be inferred using the unlabeled data. A prior over concepts generated by a mutation process on the inferred tree(s) allows efficient computation of the optimal Bayesian classification function from the labeled examples. We test our approach on eight real-world datasets.

artificial intelligence, bayesian inference, machine learning, (16 more...)

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Sparse Representation and Its Applications in Blind Source Separation

Li, Yuanqing, Amari, Shun-ichi, Shishkin, Sergei, Cao, Jianting, Gu, Fanji, Cichocki, Andrzej S.

In this paper, sparse representation (factorization) of a data matrix is first discussed. An overcomplete basis matrix is estimated by using the K-means method.

artificial intelligence, machine learning, matrix, (17 more...)

Country: Asia > Japan (0.15)

Industry: Health & Medicine (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)