AITopics

We propose an information-theoretic clustering approach that incorporates a pre-known partition of the data, aiming to identify common clusters that cut across the given partition. In the standard clustering setting the formation of clusters is guided by a single source of feature information. The newly utilized pre-partition factor introduces an additional bias that counterbalances the impact of the features whenever they become correlated with this known partition. The resulting algorithmic framework was applied successfully to synthetic data, as well as to identifying text-based cross-religion correspondences.

algorithm, configuration, information, (17 more...)

Country:

North America > United States > New York > New York County > New York City (0.04)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.48)

Kauchak, David, Dasgupta, Sanjoy

An Iterative Improvement Procedure for Hierarchical Clustering

We describe a procedure which finds a hierarchical clustering by hillclimbing. The cost function we use is a hierarchical extension of the k-means cost; our local moves are tree restructurings and node reorderings. We show these can be accomplished efficiently, by exploiting special properties of squared Euclidean distances and by using techniques from scheduling algorithms.

average linkage, cost function, node, (14 more...)

Country: North America > United States > California > San Diego County > San Diego (0.05)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.74)

Roth, Volker, Lange, Tilman

Feature Selection in Clustering Problems

A novel approach to combining clustering and feature selection is presented. It implements a wrapper strategy for feature selection, in the sense that the features are directly selected by optimizing the discriminative power of the used partitioning algorithm. On the technical side, we present an efficient optimization algorithm with guaranteed local convergence property. The only free parameter of this method is selected by a resampling-based stability analysis. Experiments with real-world datasets demonstrate that our method is able to infer both meaningful partitions and meaningful subsets of features.

algorithm, partition, selection, (14 more...)

Country:

Europe > Switzerland > Zürich > Zürich (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)

Genre:

Research Report (0.48)
Overview (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Bach, Francis R., Jordan, Michael I.

Learning Spectral Clustering

Spectral clustering refers to a class of techniques which rely on the eigenstructure of a similarity matrix to partition points into disjoint clusters with points in the same cluster having high similarity and points in different clusters having low similarity. In this paper, we derive a new cost function for spectral clustering based on a measure of error between a given partition and a solution of the spectral relaxation of a minimum normalized cut problem. Minimizing this cost function with respect to the partition leads to a new spectral clustering algorithm. Minimizing with respect to the similarity matrix leads to an algorithm for learning the similarity matrix. We develop a tractable approximation of our cost function that is based on the power method of computing eigenvectors.

algorithm, matrix, similarity matrix, (16 more...)

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
Asia > Middle East > Jordan (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Sparse Representation and Its Applications in Blind Source Separation

Li, Yuanqing, Amari, Shun-ichi, Shishkin, Sergei, Cao, Jianting, Gu, Fanji, Cichocki, Andrzej S.

In this paper, sparse representation (factorization) of a data matrix is first discussed. An overcomplete basis matrix is estimated by using the K means method.

matrix, sparse representation, vector, (15 more...)

Country:

Asia > Japan > Honshū > Kantō > Saitama Prefecture > Saitama (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Industry: Health & Medicine (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)

Fischer, Bernd, Roth, Volker, Buhmann, Joachim M.

Clustering with the Connectivity Kernel

Clustering aims at extracting hidden structure in dataset. While the problem of finding compact clusters has been widely studied in the literature, extracting arbitrarily formed elongated structures is considered a much harder problem. In this paper we present a novel clustering algorithm which tackles the problem by a two step procedure: first the data are transformed in such a way that elongated structures become compact ones. In a second step, these new objects are clustered by optimizing a compactness-based criterion. The advantages of the method over related approaches are threefold: (i) robustness properties of compactness-based criteria naturally transfer to the problem of extracting elongated structures, leading to a model which is highly robust against outlier objects; (ii) the transformed distances induce a Mercer kernel which allows us to formulate a polynomial approximation scheme to the generally N P-hard clustering problem; (iii) the new method does not contain free kernel parameters in contrast to methods like spectral clustering or mean-shift clustering.

algorithm, dissimilarity, elongated structure, (16 more...)

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Asia > Middle East > Jordan (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Bach, Francis R., Jordan, Michael I.

Learning Spectral Clustering

Spectral clustering refers to a class of techniques which rely on the eigenstructure ofa similarity matrix to partition points into disjoint clusters with points in the same cluster having high similarity and points in different clustershaving low similarity. In this paper, we derive a new cost function for spectral clustering based on a measure of error between a given partition and a solution of the spectral relaxation of a minimum normalized cut problem. Minimizing this cost function with respect to the partition leads to a new spectral clustering algorithm. Minimizing with respect to the similarity matrix leads to an algorithm for learning the similarity matrix. We develop a tractable approximation of our cost function that is based on the power method of computing eigenvectors.

artificial intelligence, machine learning, matrix, (19 more...)

Country: North America > United States > California > Alameda County > Berkeley (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Kauchak, David, Dasgupta, Sanjoy

An Iterative Improvement Procedure for Hierarchical Clustering

We describe a procedure which finds a hierarchical clustering by hillclimbing. Thecost function we use is a hierarchical extension of the k-means cost; our local moves are tree restructurings and node reorderings. Weshow these can be accomplished efficiently, by exploiting special properties of squared Euclidean distances and by using techniques from scheduling algorithms.

artificial intelligence, cost function, machine learning, (15 more...)

Country: North America > United States > California (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.74)

Derbeko, Philip, El-Yaniv, Ran, Meir, Ron

Error Bounds for Transductive Learning via Compression and Clustering

This paper is concerned with transductive learning. Although transduction appears to be an easier task than induction, there have not been many provably useful algorithms and bounds for transduction. We present explicit error bounds for transduction and derive a general technique for devising bounds within this setting. The technique is applied to derive error bounds for compression schemes such as (transductive) SVMs and for transduction algorithms based on clustering.

algorithm, artificial intelligence, machine learning, (14 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.49)