Clustering
Fusion of Similarity Data in Clustering
Lange, Tilman, Buhmann, Joachim M.
Fusing multiple information sources can yield significant benefits to successfully accomplish learning tasks. Many studies have focussed on fusing information in supervised learning contexts. We present an approach to utilize multiple information sources in the form of similarity data for unsupervised learning. Based on similarity information, the clustering task is phrased as a nonnegative matrix factorization problem of a mixture of similarity measurements. The tradeoff between the informativeness of data sources and the sparseness of their mixture is controlled by an entropy-based weighting mechanism. For the purpose of model selection, a stability-based approach is employed to ensure the selection of the most self-consistent hypothesis. The experiments demonstrate the performance of the method on toy as well as real world data sets.
Generalization in Clustering with Unobserved Features
We argue that when objects are characterized by many attributes, clustering them on the basis of a relatively small random subset of these attributes can capture information on the unobserved attributes as well. Moreover, we show that under mild technical conditions, clustering the objects on the basis of such a random subset performs almost as well as clustering with the full attribute set. We prove a finite sample generalization theorems for this novel learning scheme that extends analogous results from the supervised learning setting. The scheme is demonstrated for collaborative filtering of users with movies rating as attributes.
A Probabilistic Approach for Optimizing Spectral Clustering
Jin, Rong, Kang, Feng, Ding, Chris H.
Spectral clustering enjoys its success in both data clustering and semisupervised learning. But, most spectral clustering algorithms cannot handle multi-class clustering problems directly. Additional strategies are needed to extend spectral clustering algorithms to multi-class clustering problems. Furthermore, most spectral clustering algorithms employ hard cluster membership, which is likely to be trapped by the local optimum. In this paper, we present a new spectral clustering algorithm, named "Soft Cut". It improves the normalized cut algorithm by introducing soft membership, and can be efficiently computed using a bound optimization algorithm. Our experiments with a variety of datasets have shown the promising performance of the proposed clustering algorithm.
Mixture Modeling by Affinity Propagation
Frey, Brendan J., Dueck, Delbert
Clustering is a fundamental problem in machine learning and has been approached in many ways. Two general and quite different approaches include iteratively fitting a mixture model (e.g., using EM) and linking together pairs of training cases that have high affinity (e.g., using spectral methods). Pairwise clustering algorithms need not compute sufficient statistics and avoid poor solutions by directly placing similar examples in the same cluster. However, many applications require that each cluster of data be accurately described by a prototype or model, so affinity-based clustering - and its benefits - cannot be directly realized. We describe a technique called "affinity propagation", which combines the advantages of both approaches. The method learns a mixture model of the data by recursively propagating affinity messages. We demonstrate affinity propagation on the problems of clustering image patches for image segmentation and learning mixtures of gene expression models from microarray data. We find that affinity propagation obtains better solutions than mixtures of Gaussians, the K-medoids algorithm, spectral clustering and hierarchical clustering, and is both able to find a pre-specified number of clusters and is able to automatically determine the number of clusters. Interestingly, affinity propagation can be viewed as belief propagation in a graphical model that accounts for pairwise training case likelihood functions and the identification of cluster centers.
Kernelized Infomax Clustering
Barber, David, Agakov, Felix V.
We propose a simple information-theoretic approach to soft clustering based on maximizing the mutual information I(x, y) between the unknown cluster labels y and the training patterns x with respect to parameters of specifically constrained encoding distributions. The constraints are chosen such that patterns are likely to be clustered similarly if they lie close to specific unknown vectors in the feature space. The method may be conveniently applied to learning the optimal affinity matrix, which corresponds to learning parameters of the kernelized encoder. The procedure does not require computations of eigenvalues of the Gram matrices, which makes it potentially attractive for clustering large data sets.
Size Regularized Cut for Data Clustering
Chen, Yixin, Zhang, Ya, Ji, Xiang
We present a novel spectral clustering method that enables users to incorporate priorknowledge of the size of clusters into the clustering process. The cost function, which is named size regularized cut (SRcut), is defined as the sum of the inter-cluster similarity and a regularization term measuring therelative size of two clusters. Finding a partition of the data set to minimize SRcut is proved to be NPcomplete. An approximation algorithm isproposed to solve a relaxed version of the optimization problem as an eigenvalue problem. Evaluations over different data sets demonstrate thatthe method is not sensitive to outliers and performs better than normalized cut.
Fusion of Similarity Data in Clustering
Lange, Tilman, Buhmann, Joachim M.
Fusing multiple information sources can yield significant benefits to successfully accomplishlearning tasks. Many studies have focussed on fusing information in supervised learning contexts. We present an approach to utilize multiple information sources in the form of similarity data for unsupervised learning. Based on similarity information, the clustering task is phrased as a nonnegative matrix factorization problem of a mixture ofsimilarity measurements. The tradeoff between the informativeness ofdata sources and the sparseness of their mixture is controlled by an entropy-based weighting mechanism. For the purpose of model selection, astability-based approach is employed to ensure the selection of the most self-consistent hypothesis. The experiments demonstrate the performance of the method on toy as well as real world data sets.
Generalization in Clustering with Unobserved Features
We argue that when objects are characterized by many attributes, clustering themon the basis of a relatively small random subset of these attributes can capture information on the unobserved attributes as well. Moreover, we show that under mild technical conditions, clustering the objects on the basis of such a random subset performs almost as well as clustering with the full attribute set. We prove a finite sample generalization theoremsfor this novel learning scheme that extends analogous results from the supervised learning setting. The scheme is demonstrated for collaborative filtering of users with movies rating as attributes.
A Probabilistic Approach for Optimizing Spectral Clustering
Jin, Rong, Kang, Feng, Ding, Chris H.
Spectral clustering enjoys its success in both data clustering and semisupervised learning.But, most spectral clustering algorithms cannot handle multi-class clustering problems directly. Additional strategies are needed to extend spectral clustering algorithms to multi-class clustering problems.Furthermore, most spectral clustering algorithms employ hard cluster membership, which is likely to be trapped by the local optimum. Inthis paper, we present a new spectral clustering algorithm, named "Soft Cut". It improves the normalized cut algorithm by introducing softmembership, and can be efficiently computed using a bound optimization algorithm. Our experiments with a variety of datasets have shown the promising performance of the proposed clustering algorithm.