 Clustering


Multiple Kernel k-Means with Incomplete Kernels

AAAI Conferences

Multiple kernel clustering (MKC) algorithms optimally combine a group of pre-specified base kernels to improve clustering performance. However, existing MKC algorithms cannot efficiently address the situation where some rows and columns of base kernels are absent. This paper proposes a simple yet effective algorithm to address this issue. Unlike existing approaches, in which incomplete kernels are first imputed and a standard MKC algorithm is then applied to the imputed kernels, our algorithm integrates imputation and clustering into a unified learning procedure. Specifically, we perform multiple kernel clustering directly in the presence of incomplete kernels, which are treated as auxiliary variables to be jointly optimized. Our algorithm does not require that there be at least one complete base kernel over all the samples. Also, it adaptively imputes incomplete kernels and combines them to best serve clustering. A three-step iterative algorithm with proved convergence is designed to solve the resultant optimization problem. Extensive experiments are conducted on four benchmark data sets to compare the proposed algorithm with existing imputation-based methods. Our algorithm consistently achieves superior performance, and the improvement becomes more significant as the missing ratio increases, verifying the effectiveness and advantages of the proposed joint imputation and clustering.
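
For orientation, the two-stage baseline that the abstract contrasts itself with (impute first, then cluster) might be sketched as follows. This is not the paper's joint algorithm. Zero-filling as the imputation rule, fixed kernel weights, and the function names `zero_fill`, `combine`, and `relaxed_kernel_kmeans` are all assumptions made for illustration; the clustering step uses the standard spectral relaxation of kernel k-means (top eigenvectors of the combined kernel).

```python
import numpy as np

def zero_fill(kernel, observed):
    """Assumed imputation rule: set rows/columns of unobserved samples to zero."""
    mask = np.outer(observed, observed)
    return np.where(mask, kernel, 0.0)

def combine(kernels, weights):
    """Weighted sum of base kernels (weights are fixed here, not learned)."""
    weights = np.asarray(weights, dtype=float)
    return sum(w * K for w, K in zip(weights, kernels))

def relaxed_kernel_kmeans(K, n_clusters):
    """Spectral relaxation of kernel k-means: maximize tr(H^T K H) s.t. H^T H = I.
    The maximizer is given by the eigenvectors of the largest eigenvalues."""
    _, eigvecs = np.linalg.eigh(K)  # eigenvalues in ascending order
    return eigvecs[:, -n_clusters:]

# Toy usage: sample 2 is missing from the first base kernel.
K1 = np.array([[1.0, 0.5, np.nan],
               [0.5, 1.0, np.nan],
               [np.nan, np.nan, np.nan]])
observed = np.array([True, True, False])
K2 = np.eye(3)
H = relaxed_kernel_kmeans(combine([zero_fill(K1, observed), K2], [0.5, 0.5]), 2)
```

In practice the rows of `H` would be passed to ordinary k-means to obtain discrete labels. The abstract's point is that fixing the imputation before clustering, as done here, is what the proposed joint procedure avoids.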


Balanced Clustering with Least Square Regression

AAAI Conferences

Clustering is a fundamental research topic in data mining. A balanced clustering result is often required in a variety of applications. Many existing clustering algorithms perform well, yet fail to produce balanced clusters. In this paper, we propose a novel and simple clustering method, referred to as Balanced Clustering with Least Square regression (BCLS), which minimizes a least-squares linear regression objective subject to a balance constraint that regularizes the clustering model. In BCLS, linear regression is applied to estimate the class-specific hyperplanes that separate each class of data from the others, thus guiding the clustering of the data points into different clusters. A balance constraint is used to regularize the clustering; minimizing it helps produce balanced clusters. In addition, we apply the method of augmented Lagrange multipliers (ALM) to optimize the objective. Experiments on seven real-world benchmarks demonstrate that our approach not only produces good clustering performance but also guarantees a balanced clustering result.


Sparse Subspace Clustering by Learning Approximation ℓ0 Codes

AAAI Conferences

Subspace clustering has been widely applied to detect meaningful clusters in high-dimensional data spaces. A main challenge in subspace clustering is to quickly calculate a "good" affinity matrix. ℓ0, ℓ1, ℓ2, or nuclear norm regularization is used to construct the affinity matrix in many subspace clustering methods because of their theoretical guarantees and empirical success. However, they suffer from the following problems: (1) ℓ2 and nuclear norm regularization require very strong assumptions to guarantee a subspace-preserving affinity; (2) although ℓ1 regularization can be guaranteed to give a subspace-preserving affinity under certain conditions, it needs more time to solve a large-scale convex optimization problem; (3) ℓ0 regularization can yield a tradeoff between computational efficiency and a subspace-preserving affinity by using the orthogonal matching pursuit (OMP) algorithm, but OMP still takes considerable time to search for the solution when the number of data points is large. To overcome these problems, we first propose a learned OMP (LOMP) algorithm that trains a single hidden neural network (SHNN) to quickly approximate the ℓ0 code. We then develop a sparse subspace clustering method based on the ℓ0 code, which is computed quickly by the SHNN. Two sufficient conditions are presented to guarantee that our method gives a subspace-preserving affinity. Experiments on handwritten digit and face clustering show that our method not only computes the ℓ0 code quickly, but also outperforms the relevant subspace clustering methods in clustering results. In particular, our method achieves state-of-the-art clustering accuracy (94.32%) on MNIST.
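
As a point of reference for the OMP step mentioned above, a minimal sketch of textbook orthogonal matching pursuit is given below. It is not the learned LOMP/SHNN approximation from the paper. The function name `omp`, the dictionary `D`, and the sparsity level `k` are illustrative assumptions; in subspace clustering the dictionary would typically be built from the other data points.

```python
import numpy as np

def omp(D, x, k):
    """Greedy OMP: approximate x using at most k columns (atoms) of D."""
    x = np.asarray(x, dtype=float)
    residual = x.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        # Pick the atom most correlated with the current residual.
        scores = np.abs(D.T @ residual)
        if support:
            scores[support] = -np.inf  # never pick the same atom twice
        support.append(int(np.argmax(scores)))
        # Re-fit all selected atoms by least squares, then update the residual.
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
        if np.linalg.norm(residual) < 1e-10:
            break
    code = np.zeros(D.shape[1])
    code[support] = coef
    return code

# Toy usage with an orthonormal dictionary and a 2-sparse signal.
D = np.eye(4)
x = np.array([0.0, 3.0, 0.0, -2.0])
code = omp(D, x, k=2)
```

Each iteration solves a least-squares problem over the growing support, and this has to be repeated for every data point. That per-point search is the cost the abstract refers to when the number of data points is large.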


Twin Learning for Similarity and Clustering: A Unified Kernel Approach

AAAI Conferences

Many similarity-based clustering methods work in two separate steps: similarity matrix computation and subsequent spectral clustering. However, similarity measurement is challenging because it is usually affected by many factors, e.g., the choice of similarity metric, neighborhood size, scale of data, noise, and outliers. Thus the learned similarity matrix is often not suitable, let alone optimal, for the subsequent clustering. In addition, nonlinear similarity often exists in real-world data, yet it has not been effectively considered by most existing methods. To tackle these two challenges, we propose a model that simultaneously learns the cluster indicator matrix and similarity information in kernel spaces in a principled way. We show theoretical relationships to kernel k-means, k-means, and spectral clustering methods. Then, to address the practical issue of how to select the most suitable kernel for a particular clustering task, we further extend our model with a multiple kernel learning ability. With this joint model, we can automatically accomplish three subtasks: finding the best cluster indicator matrix, the most accurate similarity relations, and the optimal combination of multiple kernels. By leveraging the interactions between these three subtasks in a joint framework, each subtask can be iteratively improved by using the results of the others, moving towards an overall optimal solution. Extensive experiments are performed to demonstrate the effectiveness of our method.


Scalable Algorithm for Higher-Order Co-Clustering via Random Sampling

AAAI Conferences

We propose a scalable and efficient algorithm for co-clustering a higher-order tensor. Viewing tensors as hypergraphs, we formulate the co-clustering of a tensor as the problem of partitioning the corresponding hypergraph. Our algorithm is based on the random sampling technique, which has been successfully applied to graph cut problems. We extend a random sampling algorithm for the graph multiway cut problem to hypergraphs, and design a co-clustering algorithm based on it. Each iteration of our algorithm runs in time polynomial in the size of the hypergraph, and thus it performs well even for higher-order tensors, which are difficult for state-of-the-art algorithms to handle.


Bilateral k-Means Algorithm for Fast Co-Clustering

AAAI Conferences

With the development of information technology, the amount of data (e.g., text, images, and video) has increased rapidly. Efficiently clustering such large-scale data sets is a challenge. To address this problem, this paper proposes a novel co-clustering method named the bilateral k-means algorithm (BKM) for fast co-clustering. Unlike traditional k-means algorithms, the proposed method has two indicator matrices P and Q and a diagonal matrix S to be solved, which represent the cluster memberships of samples and features, and the co-cluster centres, respectively. Therefore, it can perform different clustering tasks on the samples and features simultaneously. We also introduce an effective approach to solving the proposed model that involves fewer multiplications. The computational complexity is analyzed. Extensive experiments on various types of data sets are conducted. Compared with state-of-the-art clustering methods, the proposed BKM not only runs faster, but also achieves promising clustering results.


Local Centroids Structured Non-Negative Matrix Factorization

AAAI Conferences

Non-negative Matrix Factorization (NMF) has attracted much attention and has been widely used in real-world applications. As a clustering method, it fails to handle the case where data points lie in a complicated geometric structure. Existing methods adopt a single global centroid for each cluster, failing to capture the manifold structure. In this paper, we propose a novel local centroids structured NMF to address this drawback. Instead of using a single centroid for each cluster, we introduce multiple local centroids for each cluster so that the manifold structure can be captured by the local centroids. Such a novel NMF method can improve the clustering performance effectively. Furthermore, a novel bipartite graph is incorporated to obtain the clustering indicator directly without any post-processing. Experiments on both toy datasets and real-world datasets have verified the effectiveness of the proposed method.


Structure Regularized Unsupervised Discriminant Feature Analysis

AAAI Conferences

Feature selection is an important technique in machine learning research. An effective and robust feature selection method is desired to simultaneously identify the informative features and eliminate the noisy ones. In this paper, we consider the unsupervised feature selection problem, which is particularly difficult because there are no class labels to guide the search for relevant features. To solve this, we propose a novel algorithmic framework that performs unsupervised feature selection. First, the proposed framework implements structure learning, where the data structures (including the intrinsic distribution structure and the data segmentation) are found via a combination of alternating optimization and clustering. Then, both the intrinsic data structure and the data segmentation are formulated as regularization terms for discriminant feature selection. The results of the feature selection also affect the structure learning step in the following iterations. By leveraging the interactions between structure learning and feature selection, we are able to capture a more accurate structure of the data and select more informative features. Clustering and classification experiments on real-world image data sets demonstrate the effectiveness of our method.


Label Efficient Learning by Exploiting Multi-Class Output Codes

AAAI Conferences

We present a new perspective on the popular multi-class algorithmic techniques of one-vs-all and error correcting output codes. Rather than studying the behavior of these techniques for supervised learning, we establish a connection between the success of these methods and the existence of label-efficient learning procedures. We show that in both the realizable and agnostic cases, if output codes are successful at learning from labeled data, they implicitly assume structure on how the classes are related. By making that structure explicit, we design learning algorithms to recover the classes with low label complexity. We provide results for the commonly studied cases of one-vs-all learning and when the codewords of the classes are well separated. We additionally consider the more challenging case where the codewords are not well separated, but satisfy a boundary features condition that captures the natural intuition that every bit of the codewords should be significant.
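
To make the output-code setting concrete, the sketch below shows standard nearest-codeword (Hamming) decoding, the usual supervised use of output codes that the abstract takes as its starting point. It is not the label-efficient algorithm from the paper. The matrix `CODEBOOK` and the helper names are hypothetical; its codewords differ pairwise in at least three bits, which is one simple reading of "well separated".

```python
import numpy as np

# Hypothetical code matrix: one row (codeword) per class,
# one column per binary classifier.
CODEBOOK = np.array([
    [0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1],
    [1, 1, 0, 1, 1],
])

def one_vs_all_codebook(n_classes):
    """One-vs-all is the special case where class i has codeword e_i."""
    return np.eye(n_classes, dtype=int)

def decode(bits, codebook):
    """Return the index of the class whose codeword is nearest in Hamming distance."""
    distances = np.sum(codebook != np.asarray(bits), axis=1)
    return int(np.argmin(distances))

# The codeword of class 1 with its last bit flipped still decodes to class 1.
predicted = decode([1, 1, 1, 0, 1], CODEBOOK)
```

Because any two codewords here differ in at least three positions, a single wrong binary prediction cannot change the decoded class.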


A Framework for Minimal Clustering Modification via Constraint Programming

AAAI Conferences

Consider the situation where your favorite clustering algorithm, applied to a data set, returns a good clustering that nevertheless has a few undesirable properties. One ad hoc way to fix this is to re-run the clustering algorithm and hope to find a better variation. Instead, we propose not to run the algorithm again but to minimally modify the existing clustering to remove the undesirable properties. We formulate the minimal clustering modification problem, in which we are given an initial clustering produced by any algorithm. The clustering is then modified to: i) remove the undesirable properties and ii) be minimally different from the given clustering. We show that the underlying feasibility sub-problem can be intractable and demonstrate the flexibility of our constraint programming formulation. We empirically validate its usefulness through experiments on social network and medical imaging data sets.