AITopics

We propose an information-theoretic clustering approach that incorporates a pre-known partition of the data, aiming to identify common clusters that cut across the given partition. In the standard clustering setting the formation of clusters is guided by a single source of feature information. The newly utilized pre-partition factor introduces an additional bias that counterbalances the impact of the features whenever they become correlated with this known partition. The resulting algorithmic framework was applied successfully to synthetic data, as well as to identifying text-based cross-religion correspondences.

algorithm, artificial intelligence, machine learning, (19 more...)

Country:

Asia > Middle East > Israel (0.15)
North America > United States > New York (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.48)

Roth, Volker, Lange, Tilman

Feature Selection in Clustering Problems

A novel approach to combining clustering and feature selection is presented. Itimplements a wrapper strategy for feature selection, in the sense that the features are directly selected by optimizing the discriminative powerof the used partitioning algorithm. On the technical side, we present an efficient optimization algorithm with guaranteed local convergence property.The only free parameter of this method is selected by a resampling-based stability analysis. Experiments with real-world datasets demonstrate that our method is able to infer both meaningful partitions and meaningful subsets of features.

artificial intelligence, machine learning, partition, (17 more...)

Country: North America > United States (0.14)

Genre:

Research Report (0.48)
Overview (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Hamerly, Greg, Elkan, Charles

Learning the k in k-means

When clustering a dataset, the right number k of clusters to use is often not obvious, and choosing k automatically is a hard algorithmic problem. Inthis paper we present an improved algorithm for learning k while clustering. The G-means algorithm is based on a statistical test for the hypothesis that a subset of data follows a Gaussian distribution. G-means runs k-means with increasing k in a hierarchical fashion until the test accepts thehypothesis that the data assigned to each k-means center are Gaussian. Two key advantages are that the hypothesis test does not limit the covariance of the data and does not compute a full covariance matrix. Additionally, G-means only requires one intuitive parameter, the standard statisticalsignificance level α. We present results from experiments showing that the algorithm works well, and better than a recent method based on the BIC penalty for model complexity. In these experiments, we show that the BIC is ineffective as a scoring function, since it does not penalize strongly enough the model's complexity.

algorithm, artificial intelligence, machine learning, (17 more...)

Country: North America > United States > California > San Francisco County > San Francisco (0.15)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.49)

Sparse Representation and Its Applications in Blind Source Separation

Li, Yuanqing, Amari, Shun-ichi, Shishkin, Sergei, Cao, Jianting, Gu, Fanji, Cichocki, Andrzej S.

In this paper, sparse representation (factorization) of a data matrix is first discussed. An overcomplete basis matrix is estimated by using the K means method.

artificial intelligence, machine learning, matrix, (17 more...)

Country: Asia > Japan (0.15)

Industry: Health & Medicine (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)

Fischer, Bernd, Roth, Volker, Buhmann, Joachim M.

Clustering with the Connectivity Kernel

Clustering aims at extracting hidden structure in dataset. While the problem offinding compact clusters has been widely studied in the literature, extractingarbitrarily formed elongated structures is considered a much harder problem. In this paper we present a novel clustering algorithm whichtackles the problem by a two step procedure: first the data are transformed in such a way that elongated structures become compact ones. In a second step, these new objects are clustered by optimizing a compactness-based criterion. The advantages of the method over related approaches are threefold: (i) robustness properties of compactness-based criteria naturally transfer to the problem of extracting elongated structures, leadingto a model which is highly robust against outlier objects; (ii) the transformed distances induce a Mercer kernel which allows us to formulate a polynomial approximation scheme to the generally N P-hard clustering problem; (iii) the new method does not contain free kernel parameters in contrast to methods like spectral clustering or mean-shift clustering.

algorithm, artificial intelligence, machine learning, (18 more...)

Country: Europe > Switzerland > Zürich > Zürich (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Journal of Artificial Intelligence ResearchOct-1-2004

Explicit Learning Curves for Transduction and Application to Clustering and Compression Algorithms

Derbeko, P., El-Yaniv, R., Meir, R.

Inductive learning is based on inferring a general rule from a finite data set and using it to label new data. In transduction one attempts to solve the problem of using a labeled training set to label a set of unlabeled points, which are given to the learner prior to learning. Although transduction seems at the outset to be an easier task than induction, there have not been many provably useful algorithms for transduction. Moreover, the precise relation between induction and transduction has not yet been determined. The main theoretical developments related to transduction were presented by Vapnik more than twenty years ago. One of Vapnik's basic results is a rather tight error bound for transductive classification based on an exact computation of the hypergeometric tail. While tight, this bound is given implicitly via a computational routine. Our first contribution is a somewhat looser but explicit characterization of a slightly extended PAC-Bayesian version of Vapnik's transductive bound. This characterization is obtained using concentration inequalities for the tail of sums of random variables obtained by sampling without replacement. We then derive error bounds for compression schemes such as (transductive) support vector machines and for transduction algorithms based on clustering. The main observation used for deriving these new error bounds and algorithms is that the unlabeled test points, which in the transductive setting are known in advance, can be used in order to construct useful data dependent prior distributions over the hypothesis space.

algorithm, transduction, vapnik, (16 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1417

AI Access Foundation

10387

Journal of Artificial Intelligence Research

Country:

North America > United States > New York (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Middle East > Israel > Haifa District > Haifa (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Molla, Michael, Waddell, Michael, Page, David, Shavlik, Jude

Using Machine Learning to Design and Interpret Gene-Expression Microarrays

AI MagazineMar-15-2004

Gene-expression microarrays, commonly called gene chips, make it possible to simultaneously measure the rate at which a cell or tissue is expressing -- translating into a protein -- each of its thousands of genes. One can use these comprehensive snapshots of biological activity to infer regulatory pathways in cells; identify novel targets for drug design; and improve the diagnosis, prognosis, and treatment planning for those suffering from disease. However, the amount of data this new technology produces is more than one can manually analyze. Hence, the need for automated analysis of microarray data offers an opportunity for machine learning to have a significant impact on biology and medicine. This article describes microarray technology, the data it produces, and the types of machine learning tasks that naturally arise with these data. It also reviews some of the recent prominent applications of machine learning to gene-chip data, points to related tasks where machine learning might have a further impact on biology and medicine, and describes additional types of interesting data that recent advances in biotechnology allow biomedical researchers to collect.

bioinformatics, experiment, machine learning, (18 more...)

AI Magazine

Country: North America > United States > California (0.28)

Genre: Overview (0.46)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Biomedical Informatics > Translational Bioinformatics (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)
(2 more...)

Neural Information Processing SystemsDec-31-2003

An Impossibility Theorem for Clustering

Kleinberg, Jon M.

Although the study of clustering is centered around an intuitively compelling goal, it has been very difficult to develop a unified framework for reasoning about it at a technical level, and profoundly diverse approaches to clustering abound in the research community. Here we suggest a formal perspective on the difficulty in finding such a unification, in the form of an impossibility theorem: for a set of three simple properties, we show that there is no clustering function satisfying all three. Relaxations of these properties expose some of the interesting (and unavoidable) tradeoffs at work in well-studied clustering techniques such as single-linkage, sum-of-pairs, k-means, and k-median.

consistency, partition, satisfy scale-invariance, (16 more...)

Country: North America > United States > New York > Tompkins County > Ithaca (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Tsuda, Koji, Kawanabe, Motoaki, Müller, Klaus-Robert

Clustering with the Fisher Score

Neural Information Processing SystemsDec-31-2003

Recently the Fisher score (or the Fisher kernel) is increasingly used as a feature extractor for classification problems. The Fisher score is a vector of parameter derivatives of loglikelihood of a probabilistic model. This paper gives a theoretical analysis about how class information is preserved in the space of the Fisher score, which turns out that the Fisher score consists of a few important dimensions with class information and many nuisance dimensions. When we perform clustering with the Fisher score, K-Means type methods are obviously inappropriate because they make use of all dimensions. So we will develop a novel but simple clustering algorithm specialized for the Fisher score, which can exploit important dimensions. This algorithm is successfully tested in experiments with artificial data and real data (amino acid sequences).

dimension, fisher score, nuisance dimension, (15 more...)

Country:

Europe > Germany > Brandenburg > Potsdam (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Berlin (0.04)
(2 more...)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Chennubhotla, Chakra, Jepson, Allan D.

Half-Lives of EigenFlows for Spectral Clustering

Neural Information Processing SystemsDec-31-2003

Using a Markov chain perspective of spectral clustering we present an algorithm to automatically find the number of stable clusters in a dataset. The Markov chain's behaviour is characterized by the spectral properties of the matrix of transition probabilities, from which we derive eigenflows along with their halflives. An eigenflow describes the flow of probability mass due to the Markov chain, and it is characterized by its eigenvalue, or equivalently, by the halflife of its decay as the Markov chain is iterated. A ideal stable cluster is one with zero eigenflow and infinite half-life. The key insight in this paper is that bottlenecks between weakly coupled clusters can be identified by computing the sensitivity of the eigenflow's halflife to variations in the edge weights.

algorithm, eigenvector, sensitivity, (17 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Middle East > Jordan (0.05)
North America > United States > New York (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.70)