Clustering
Instance-Wise Weighted Nonnegative Matrix Factorization for Aggregating Partitions with Locally Reliable Clusters
Zheng, Xiaodong (Fudan University) | Zhu, Shanfeng (Fudan University) | Gao, Junning (Fudan University) | Mamitsuka, Hiroshi (Kyoto University)
We address an ensemble clustering problem in which reliable clusters are locally embedded in the given multiple partitions. We propose a new nonnegative matrix factorization (NMF)-based method, in which locally reliable clusters are explicitly considered by using instance-wise weights over clusters. Our method factorizes the input cluster assignment matrix into two matrices H and W, which are optimized by alternating between 1) updating H and W while keeping the weight matrix constant and 2) updating the weight matrix while keeping H and W constant. The weights in the second step are updated by solving a convex problem, which makes our algorithm significantly faster than existing NMF-based ensemble clustering methods. We show empirically, on a variety of datasets, that our method outperforms many cutting-edge ensemble clustering methods.
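The alternating scheme described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the objective or the convex weight problem, so the code below assumes a weighted squared-error objective with standard multiplicative updates for step 1, and a hypothetical entropy-regularized weight update (a convex problem with a softmax closed form) for step 2. The names `one_hot_partitions`, `weighted_nmf_ensemble`, `M`, and `lam` are illustrative only.

```python
import numpy as np


def one_hot_partitions(partitions):
    """Stack base partitions (label vectors) into a binary n x c assignment matrix."""
    blocks = []
    for labels in partitions:
        labels = np.asarray(labels)
        ids = np.unique(labels)
        # One column per cluster of this base partition.
        blocks.append((labels[:, None] == ids[None, :]).astype(float))
    return np.hstack(blocks)


def weighted_nmf_ensemble(A, k, n_outer=20, n_inner=10, lam=1.0, seed=0, eps=1e-9):
    """Alternate between (1) updating H, W with weights M fixed and
    (2) updating M with H, W fixed. Both update rules are assumptions."""
    rng = np.random.default_rng(seed)
    n, c = A.shape
    H = rng.random((n, k)) + 0.1
    W = rng.random((k, c)) + 0.1
    M = np.ones((n, c))  # instance-wise weights over base clusters
    for _ in range(n_outer):
        # Step 1 (assumed form): multiplicative updates for
        # sum_ij M_ij * (A_ij - (HW)_ij)^2, which keep H and W nonnegative.
        for _ in range(n_inner):
            H *= ((M * A) @ W.T) / ((M * (H @ W)) @ W.T + eps)
            W *= (H.T @ (M * A)) / (H.T @ (M * (H @ W)) + eps)
        # Step 2 (hypothetical): for each instance, minimize
        # sum_j m_j * r_j + lam * sum_j m_j * log(m_j) over the simplex.
        # This is convex and its solution is a softmax of -r / lam.
        R = (A - H @ W) ** 2
        Z = np.exp(-(R - R.min(axis=1, keepdims=True)) / lam)
        M = c * Z / Z.sum(axis=1, keepdims=True)  # each row sums to c
    return H, W, M
```

Under these assumptions, a consensus label for each instance could be read off as `H.argmax(axis=1)`, and entries of `M` with low values mark base clusters that fit a given instance poorly.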
Using Social Media to Enhance Emergency Situation Awareness: Extended Abstract
Yin, Jie (CSIRO) | Karimi, Sarvnaz (CSIRO) | Lampert, Andrew (Palantir Technologies) | Cameron, Mark (CSIRO) | Robinson, Bella (CSIRO) | Power, Robert (CSIRO)
Social media platforms, such as Twitter, offer a rich source of real-time information about real-world events, particularly during mass emergencies. Sifting valuable information from social media provides useful insight into time-critical situations for emergency officers to understand the impact of hazards and act on emergency responses in a timely manner. This work focuses on analyzing Twitter messages generated during natural disasters, and shows how natural language processing and data mining techniques can be utilized to extract situation awareness information from Twitter. We present key relevant approaches that we have investigated including burst detection, tweet filtering and classification, online clustering, and geotagging.
Learning a Robust Consensus Matrix for Clustering Ensemble via Kullback-Leibler Divergence Minimization
Zhou, Peng (Chinese Academy of Sciences) | Du, Liang (Chinese Academy of Sciences) | Wang, Hanmo (Chinese Academy of Sciences) | Shi, Lei (Chinese Academy of Sciences) | Shen, Yi-Dong (Chinese Academy of Sciences)
Clustering ensemble has emerged as an important extension of the classical clustering problem. It provides a framework for combining multiple base clusterings of a data set to generate a final consensus result. Most existing clustering methods simply combine clustering results without accounting for noise, which may degrade the clustering performance. In this paper, we propose a novel robust clustering ensemble method. To improve robustness, we capture the sparse and symmetric errors and integrate them into our robust consensus framework to learn a low-rank matrix. Since the objective function is difficult to optimize, we develop a block coordinate descent algorithm that is theoretically guaranteed to converge. Experimental results on real-world data sets demonstrate the effectiveness of our method.
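For context, the "simple combination" that this abstract contrasts itself with can be sketched as a plain co-association (consensus) matrix, where each entry is the fraction of base clusterings that place two instances in the same cluster. This is a baseline under stated assumptions, not the robust KL-divergence method proposed in the paper; the function name `co_association` is illustrative.

```python
import numpy as np


def co_association(partitions):
    """Average the pairwise same-cluster indicators over all base clusterings.

    partitions: array-like of shape (m, n), one label vector per base clustering.
    Returns an n x n symmetric matrix with entries in [0, 1].
    """
    P = np.asarray(partitions)
    m, n = P.shape
    S = np.zeros((n, n))
    for labels in P:
        # 1 where two instances share a label in this clustering, else 0.
        S += labels[:, None] == labels[None, :]
    return S / m
```

A noisy base clustering shifts entries of this matrix directly, which is the sensitivity a robust consensus method aims to reduce.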
Recovery of Corrupted Multiple Kernels for Clustering
Zhou, Peng (Chinese Academy of Sciences) | Du, Liang (Chinese Academy of Sciences) | Shi, Lei (Chinese Academy of Sciences) | Wang, Hanmo (Chinese Academy of Sciences) | Shen, Yi-Dong (Chinese Academy of Sciences)
Kernel-based methods, such as kernel k-means and kernel PCA, have been widely used in machine learning tasks. The performance of these methods depends critically on the selection of kernel functions. The challenge is that we usually do not know in advance which kernels are suitable for the given data and task. This has led to research on multiple kernel learning, i.e., learning a consensus kernel from multiple candidate kernels. Existing multiple kernel learning methods have difficulty dealing with noise. In this paper, we propose a novel method for learning a robust yet low-rank kernel for clustering tasks. We observe that the noise in each kernel has a specific structure, which we exploit to clean the multiple input kernels and then aggregate them into a robust, low-rank consensus kernel. Although the underlying optimization problem is hard to solve, we show that it can be solved via alternating minimization, whose convergence is theoretically guaranteed. Experimental results on several benchmark data sets further demonstrate the effectiveness of our method.
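As a point of reference for the idea of a consensus kernel, the sketch below builds several candidate RBF kernels and combines them by a uniform average. This is only a naive baseline under assumed choices (RBF kernels, equal weights); it performs none of the noise cleaning or low-rank recovery the abstract describes. The names `rbf_kernel` and `average_kernel` are illustrative.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X ** 2, axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    # Clip tiny negative distances caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(D, 0.0))


def average_kernel(kernels):
    """Uniformly average candidate kernel matrices into one consensus kernel."""
    return np.mean(np.stack(kernels), axis=0)


# Usage: three candidate kernels with different (assumed) bandwidths.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
K = average_kernel([rbf_kernel(X, g) for g in (0.1, 1.0, 10.0)])
```

Because an average of positive semidefinite matrices is itself positive semidefinite, `K` remains a valid kernel, but any corruption in one candidate propagates into the average.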