Pan, Yan
Robust Multi-View Spectral Clustering via Low-Rank and Sparse Decomposition
Xia, Rongkai (Sun Yat-sen University) | Pan, Yan (Sun Yat-sen University) | Du, Lei (Sun Yat-sen University) | Yin, Jian (Sun Yat-sen University)
Multi-view clustering, which seeks a partition of the data in multiple views that often provide complementary information to each other, has received considerable attention in recent years. In real-life clustering problems, the data in each view may have considerable noise. However, existing clustering methods blindly combine the information from multi-view data with possibly considerable noise, which often degrades their performance. In this paper, we propose a novel Markov chain method for Robust Multi-view Spectral Clustering (RMSC). Our method has a flavor of low-rank and sparse decomposition: we first construct a transition probability matrix from each single view, and then use these matrices to recover a shared low-rank transition probability matrix as a crucial input to the standard Markov chain method for clustering. The optimization problem of RMSC has a low-rank constraint on the transition probability matrix, and simultaneously a probabilistic simplex constraint on each of its rows. To solve this challenging optimization problem, we propose an optimization procedure based on the Augmented Lagrangian Multiplier scheme. Experimental results on various real-world datasets show that the proposed method has superior performance over several state-of-the-art methods for multi-view clustering.
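A minimal sketch of the RMSC idea described above, not the authors' code: build a row-stochastic transition matrix per view, then recover a shared low-rank transition matrix P with sparse per-view errors E_i via an ALM-style loop. The RBF similarity, the simple per-row simplex projection, the parameter values, and all function names are illustrative assumptions.

```python
import numpy as np

def transition_matrix(X, sigma=1.0):
    """Row-normalized RBF similarity matrix for one view (n x d data matrix X)."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T
    W = np.exp(-d2 / (2 * sigma**2))
    return W / W.sum(axis=1, keepdims=True)

def project_simplex(v):
    """Euclidean projection of a vector onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > (css - 1))[0][-1]
    theta = (css[rho] - 1) / (rho + 1.0)
    return np.maximum(v - theta, 0)

def svt(M, tau):
    """Singular value thresholding (proximal operator of the nuclear norm)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0)) @ Vt

def rmsc_sketch(Ps, lam=0.5, mu=1.0, n_iter=100):
    """Recover a shared low-rank transition matrix from per-view matrices Ps,
    assuming the constraints P_i = P + E_i with sparse errors E_i."""
    n, m = Ps[0].shape[0], len(Ps)
    P = sum(Ps) / m
    Es = [np.zeros((n, n)) for _ in range(m)]
    Ys = [np.zeros((n, n)) for _ in range(m)]
    for _ in range(n_iter):
        # Low-rank step: SVT of the averaged constraint terms P_i - E_i + Y_i / mu.
        Q = sum(Pi - Ei + Yi / mu for Pi, Ei, Yi in zip(Ps, Es, Ys)) / m
        P = svt(Q, 1.0 / (m * mu))
        # Heuristic handling of the simplex constraint: project each row of P.
        P = np.apply_along_axis(project_simplex, 1, P)
        for i in range(m):
            # Sparse error step: elementwise soft thresholding per view.
            R = Ps[i] - P + Ys[i] / mu
            Es[i] = np.sign(R) * np.maximum(np.abs(R) - lam / mu, 0)
            # Dual update for the constraint P_i = P + E_i.
            Ys[i] += mu * (Ps[i] - P - Es[i])
    return P
```

The recovered P would then serve as the transition matrix fed to the standard Markov chain spectral clustering step, as the abstract describes.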
Supervised Hashing for Image Retrieval via Image Representation Learning
Xia, Rongkai (Sun Yat-sen University) | Pan, Yan (Sun Yat-sen University) | Lai, Hanjiang (Sun Yat-sen University) | Liu, Cong (Sun Yat-sen University) | Yan, Shuicheng (National University of Singapore)
Hashing is a popular approximate nearest neighbor search approach for large-scale image retrieval. Supervised hashing, which incorporates similarity/dissimilarity information on entity pairs to improve the quality of hashing function learning, has recently received increasing attention. However, in the existing supervised hashing methods for images, an input image is usually encoded by a vector of hand-crafted visual features. Such hand-crafted feature vectors do not necessarily preserve the accurate semantic similarities of image pairs, which may often degrade the performance of hashing function learning. In this paper, we propose a supervised hashing method for image retrieval, in which we automatically learn a good image representation tailored to hashing as well as a set of hash functions. The proposed method has two stages. In the first stage, given the pairwise similarity matrix $S$ over training images, we propose a scalable coordinate descent method to decompose $S$ into a product $HH^T$, where $H$ is a matrix with each of its rows being the approximate hash code associated with a training image. In the second stage, we propose to simultaneously learn a good feature representation for the input images as well as a set of hash functions, via a deep convolutional network tailored to the learned hash codes in $H$ and, optionally, the discrete class labels of the images. Extensive empirical evaluations on three benchmark datasets with different kinds of images show that the proposed method yields superior performance gains over several state-of-the-art supervised and unsupervised hashing methods.
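A minimal sketch of the stage-one factorization idea: find an n x q code matrix H with entries in [-1, 1] such that (1/q) H H^T approximates the pairwise similarity matrix S, then binarize. This uses a simple projected-gradient relaxation rather than the paper's scalable coordinate descent; the function name, step size, and bit count are illustrative assumptions.

```python
import numpy as np

def approximate_codes(S, q=12, n_iter=500, lr=1e-3, seed=0):
    """Relaxed codes H minimizing ||S - (1/q) H H^T||_F^2, binarized at the end."""
    rng = np.random.default_rng(seed)
    n = S.shape[0]
    H = rng.uniform(-1.0, 1.0, size=(n, q))
    for _ in range(n_iter):
        R = H @ H.T / q - S                     # residual of the factorization
        grad = (4.0 / q) * R @ H                # gradient w.r.t. H (S and R symmetric)
        H = np.clip(H - lr * grad, -1.0, 1.0)   # keep the relaxed codes in [-1, 1]
    return np.sign(H)                           # approximate binary hash codes

# Example: a toy similarity matrix with two well-separated groups of five images.
S = np.block([[np.ones((5, 5)), -np.ones((5, 5))],
              [-np.ones((5, 5)), np.ones((5, 5))]])
H = approximate_codes(S)
```

The rows of H would then serve as the targets for the stage-two deep convolutional network that learns the image representation and hash functions.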
Rank Aggregation via Low-Rank and Structured-Sparse Decomposition
Pan, Yan (Sun Yat-sen University) | Lai, Hanjiang (Sun Yat-sen University) | Liu, Cong (Sun Yat-sen University) | Tang, Yong (South China Normal University) | Yan, Shuicheng (National University of Singapore)
Rank aggregation, which combines multiple individual rank lists to obtain a better one, is a fundamental technique in various applications such as meta-search and recommendation systems. Most existing rank aggregation methods blindly combine multiple rank lists with possibly considerable noise, which often degrades their performance. In this paper, we propose a new model for robust rank aggregation (RRA) via matrix learning, which recovers a latent rank list from the possibly incomplete and noisy input rank lists. In our model, we construct a pairwise comparison matrix to encode the order information in each input rank list. Based on our observations, each comparison matrix can be naturally decomposed into a shared low-rank matrix, combined with a deviation error matrix which is the sum of a column-sparse matrix and a row-sparse one. The latent rank list can be easily extracted from the learned low-rank matrix. The optimization formulation of RRA has an element-wise multiplication operator to handle missing values, a symmetric constraint on the noise structure, and a factorization trick to restrict the maximum rank of the low-rank matrix. To solve this challenging optimization problem, we propose a novel procedure based on the Augmented Lagrangian Multiplier scheme. We conduct extensive experiments on meta-search and collaborative filtering benchmark datasets. The results show that the proposed RRA yields superior performance gains over several state-of-the-art algorithms for rank aggregation.
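A minimal sketch of the RRA ingredients, under simplifying assumptions: each (possibly incomplete) rank list is encoded as a pairwise comparison matrix with a mask for unobserved pairs, a shared low-rank matrix is estimated by singular value thresholding of the masked average (a crude stand-in for the paper's ALM procedure with structured-sparse error terms), and the latent rank list is read off that matrix. All names, the threshold value, and the row-sum scoring rule are illustrative assumptions.

```python
import numpy as np

def comparison_matrix(rank_list, n_items):
    """Encode a rank list (item ids ordered best-first) as a comparison matrix.

    C[i, j] = +1 if item i is ranked above item j, -1 if below; the mask marks
    pairs where both items appear in the (possibly partial) list.
    """
    C = np.zeros((n_items, n_items))
    mask = np.zeros((n_items, n_items), dtype=bool)
    pos = {item: r for r, item in enumerate(rank_list)}
    for i in pos:
        for j in pos:
            if i != j:
                C[i, j] = 1.0 if pos[i] < pos[j] else -1.0
                mask[i, j] = True
    return C, mask

def aggregate(rank_lists, n_items, tau=1.0):
    """Aggregate rank lists via a masked average plus singular value thresholding."""
    num = np.zeros((n_items, n_items))
    cnt = np.zeros((n_items, n_items))
    for rl in rank_lists:
        C, mask = comparison_matrix(rl, n_items)
        num += C * mask
        cnt += mask
    avg = np.divide(num, cnt, out=np.zeros_like(num), where=cnt > 0)
    U, s, Vt = np.linalg.svd(avg)                   # low-rank estimate via SVT
    L = U @ np.diag(np.maximum(s - tau, 0)) @ Vt
    scores = L.sum(axis=1)                          # higher row sum = ranked higher
    return np.argsort(-scores)

# Example: three noisy and partial rank lists over 5 items (ids 0..4).
print(aggregate([[0, 1, 2, 3, 4], [1, 0, 2, 4], [0, 2, 1, 3]], n_items=5))
```

In the paper's full formulation the masked average and SVT step are replaced by the joint ALM optimization over the low-rank matrix and the column- and row-sparse error terms described above.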