Clustering
Multi-Task Multi-View Clustering for Non-Negative Data
Zhang, Xianchao (Dalian University of Technology) | Zhang, Xiaotong (Dalian University of Technology) | Liu, Han (Dalian University of Technology)
Multi-task clustering and multi-view clustering have severally found wide applications and received much attention in recent years. Nevertheless, there are many clustering problems that involve both multi-task clustering and multi-view clustering, i.e., the tasks are closely related and each task can be analyzed from multiple views. In this paper, for non-negative data (e.g., documents), we introduce a multi-task multi-view clustering (MTMVC) framework which integrates within-view-task clustering, multi-view relationship learning and multi-task relationship learning. We then propose a specific algorithm to optimize the MTMVC framework. Experimental results show the superiority of the proposed algorithm over either multi-task clustering algorithms or multi-view clustering algorithms for multi-task clustering of multi-view data.
Instance-Wise Weighted Nonnegative Matrix Factorization for Aggregating Partitions with Locally Reliable Clusters
Zheng, Xiaodong (Fudan University) | Zhu, Shanfeng (Fudan University) | Gao, Junning (Fudan University) | Mamitsuka, Hiroshi (Kyoto University)
We address an ensemble clustering problem, where reliable clusters are locally embedded in given multiple partitions. We propose a new nonnegative matrix factorization (NMF)-based method, in which locally reliable clusters are explicitly considered by using instance-wise weights over clusters. Our method factorizes the input cluster assignment matrix into two matrices H and W, which are optimized by iteratively 1) updating H and W while keeping the weight matrix constant and 2) updating the weight matrix while keeping H and W constant, alternatively. The weights in the second step were updated by solving a convex problem, which makes our algorithm significantly faster than existing NMF-based ensemble clustering methods. We empirically proved that our method outperformed a lot of cutting-edge ensemble clustering methods by using a variety of datasets.
Multi-view Self-Paced Learning for Clustering
Xu, Chang (Peking University) | Tao, Dacheng (University of Technology, Sydney) | Xu, Chao (Peking University)
Exploiting the information from multiple views can improve clustering accuracy. However, most ย existing multi-view clustering algorithms are non-convex and are thus prone to becoming stuck into bad local minima, especially when there are outliers and missing data. To overcome this problem, we present a new multi-view self-paced learning (MSPL) algorithm for clustering, that ย learns the multi-view model by not only progressing from 'easy' ย to 'complex' examples, but also from 'easy' ย to 'complex' views. Instead of binarily separating the examples or views into 'easy' and 'complex', we design a novel probabilistic smoothed weighting scheme. Employing multiple views for clustering and ย defining complexity ย across both examples and views are shown theoretically ย to be beneficial to optimal clustering. Experimental results on toy and real-world data demonstrate the efficacy of the proposed algorithm.
A Joint Optimization Framework of Sparse Coding and Discriminative Clustering
Wang, Zhangyang (University of Illinois at Urbana-Champaign) | Yang, Yingzhen (University of Illinois at Urbana-Champaign) | Chang, Shiyu (University of Illinois at Urbana-Champaign) | Li, Jinyan (University of Macau) | Fong, Simon (University of Macau) | Huang, Thomas S (University of Illinois at Urbana-Champaign)
Many clustering methods highly depend on extracted features. In this paper, we propose a joint optimization framework in terms of both feature extraction and discriminative clustering. We utilize graph regularized sparse codes as the features, and formulate sparse coding as the constraint for clustering. Two cost functions are developed based on entropy-minimization and maximum-margin clustering principles, respectively, as the objectives to be minimized. Solving such a bi-level optimization mutually reinforces both sparse coding and clustering steps. ย Experiments on several benchmark datasets verify remarkable performance improvements led by the proposed joint optimization.
Constrained Information-Theoretic Tripartite Graph Clustering to Identify Semantically Similar Relations
Wang, Chenguang (Peking University) | Song, Yangqiu (University of Illinois at Urbana-Champaign) | Roth, Dan (University of Illinois at Urbana-Champaign) | Wang, Chi (Microsoft Research) | Han, Jiawei (University of Illinois at Urbana-Champaign) | Ji, Heng (Rensselaer Polytechnic Institute) | Zhang, Ming (Peking University)
In knowledge bases or information extraction results, differently expressed relations can be semantically similar (e.g., (X, wrote, Y) and (X,โs written work, Y)). Therefore, grouping semantically similar relations into clusters would facilitate and improve many applications, including knowledge base completion, information extraction, information retrieval, and more. This paper formulates relation clustering as a constrained tripartite graph clustering problem, presents an efficient clustering algorithm and exhibits the advantage of the constrained framework. We introduce several ways that provide side information via must-link and cannot link constraints to improve the clustering results. Different from traditional semi-supervised learning approaches, we propose to use the similarity of relation expressions and the knowledge of entity types to automatically construct the constraints for the algorithm. We show improved relation clustering results on two datasets extracted from human annotated knowledge base (i.e., Freebase) and open information extraction results (i.e., ReVerb data).
Deep Linear Coding for Fast Graph Clustering
Shao, Ming (Northeastern University) | Li, Sheng (Northeastern University) | Ding, Zhengming (Northeastern University) | Fu, Yun (Northeastern University)
Clustering has been one of the most critical unsupervised learning techniques that has been widely applied in data mining problems. As one of its branches, graph clustering enjoys its popularity due to its appealing performance and strong theoretical supports. However, the eigen-decomposition problems involved are computationally expensive. In this paper, we propose a deep structure with a linear coder as the building block for fast graph clustering, called Deep Linear Coding (DLC). Different from conventional coding schemes, we jointly learn the feature transform function and discriminative codings, and guarantee that the learned codes are robust in spite of local distortions. In addition, we use the proposed linear coders as the building blocks to formulate a deep structure to further refine features in a layerwise fashion. Extensive experiments on clustering tasks demonstrate that our method performs well in terms of both time complexity and clustering accuracy. On a large-scale benchmark dataset (580K), our method runs 1500 times faster than the original spectral clustering.
Regularizing Flat Latent Variables with Hierarchical Structures
Lin, Rongcheng (University of North Carolina at Charlotte) | Li, Huayu (University of North Carolina at Charlotte) | Quan, Xiaojun (Institute for Infocomm Research) | Hong, Richang (Hefei University of Technology) | Wu, Zhiang (Nanjing University of Finance and Economics) | Ge, Yong (University of North Carolina at Charlotte)
In this paper, we propose a stratified topic model (STM). Instead of directly modeling and inferring flat topics or hierarchically structured topics, we use the stratified relationships in topic hierarchies to regularize the flat topics. The topic structures are captured by a hierarchical clustering method and play as constraints during the learning process. We propose two theoretically sound and practical inference methods to solve the model. Experimental results with two real world data sets and various evaluation metrics demonstrate the effectiveness of the proposed model.
A New Simplex Sparse Learning Model to Measure Data Similarity for Clustering
Huang, Jin (University of Texas at Arlington) | Nie, Feiping (University of Texas at Arlington) | Huang, Heng (University of Texas at Arlington)
The Laplacian matrix of a graph can be used in many areas of mathematical research and has a physical interpretation in various theories. However, there are a few open issues in the Laplacian graph construction: (i) Selecting the appropriate scale of analysis, (ii) Selecting the appropriate number of neighbors, (iii) Handling multiscale data, and, (iv) Dealing with noise and outliers. In this paper, we propose that the affinity between pairs of samples could be computed using sparse representation with proper constraints. This parameter free setting automatically produces the Laplacian graph, leads to significant reduction in computation cost and robustness to the outliers and noise. We further provide an efficient algorithm to solve the difficult optimization problem based on improvement of existing algorithms. To demonstrate our motivation, we conduct spectral clustering experiments with benchmark methods. Empirical experiments on 9 data sets demonstrate the effectiveness of our method.
Robust Multiple Kernel K-means Using L21-Norm
Du, Liang (Chinese Academy of Sciences and Shanxi University) | Zhou, Peng (Chinese Academy of Sciences) | Shi, Lei (Chinese Academy of Sciences) | Wang, Hanmo (Chinese Academy of Sciences) | Fan, Mingyu (Wenzhou University) | Wang, Wenjian (Shanxi University) | Shen, Yi-Dong (Chinese Academy of Sciences)
The k-means algorithm is one of the most often used method for data clustering. However, the standard k-means can only be applied in the original feature space. The kernel k-means, which extends k-means into the kernel space, can be used to capture the non-linear structure and identify arbitrarily shaped clusters. Since both the standard k-means and kernel k-means apply the squared error to measure the distances between data points and cluster centers, a few outliers will cause large errors and dominate the objection function. Besides, the performance of kernel method is largely determined by the choice of kernel. Unfortunately, the most suitable kernel for a particular task is often unknown in advance. In this paper, we first present a robust k-means using l 2,1 -norm in the feature space and then extend it to the kernel space. To recap the powerfulness of kernel methods, we further propose a novel robust multiple kernel k-means (RMKKM) algorithm that simultaneously finds the best clustering label, the cluster membership and the optimal combination of multiple kernels. An alternating iterative schema is developed to find the optimal value. Extensive experiments well demonstrate the effectiveness of the proposed algorithms.
Intersecting Manifolds: Detection, Segmentation, and Labeling
Deutsch, Shay (University of Southern California) | Medioni, Gerard Guy (University of Southern California)
Solving multi-manifolds clustering problems that include delineating and resolving multiple intersections is a very challenging problem. In this paper we propose a novel procedure for clustering intersecting multi-manifolds and delineating junctions in high dimensional spaces. We propose to explicitly and directly resolve ambiguities near the intersections by using 2 properties: One is the position of the data points in the vicinity of the detected intersection; the other is the reliable estimation of the tangent spaces away from the intersections. We experiment with our method on a wide range of geometrically complex settings of convoluted intersecting manifolds, on which we demon- strate higher clustering performance than the state of the art. This includes tackling challenging geometric structures such as when the tangent spaces at the intersections points are not orthogonal.