Genre
Non-Metric Label Propagation
Zhang, Yin (Nanjing University) | Zhou, Zhi-Hua (Nanjing University)
In many applications non-metric distances are better than metricย distances in reflecting the perceptual distances of human beings.ย Previous studies on non-metric distances mainly focused onย supervised setting and did not consider the usefulness of unlabeledย data. In this paper, we present probably the first study of labelย propagation on graphs induced from non-metric distances. Theย challenge here lies in the fact that the triangular inequality doesย not hold for non-metric distances and therefore, a directย application of existing label propagation methods will lead toย inconsistency and conflict. We show that by applying spectrumย transformation, non-metric distances can be converted into metricย ones, and thus label propagation can be executed. Such methods,ย however, suffer from the change of original semantic relations. As aย main result of this paper, we prove that any non-metric distanceย matrix can be decomposed into two metric distance matricesย containing different information of the data. Based on thisย recognition, our proposed NMLP method derives two graphsย from the original non-metric distance and performs a joint labelย propagation on the joint graph. Experiments validate theย effectiveness of the proposed NMLP method.
M 3 IC: Maximum Margin Multiple Instance Clustering
Zhang, Dan (Purdue University, West Lafayette) | Wang, Fei (Florida International University) | Si, Luo (Purdue University, West Lafayette) | Li, Tao (Florida International University)
Clustering, classification, and regression, are three major research topics in machine learning. So far, much work has been conducted in solving multiple instance classification and multiple instance regression problems, where supervised training patterns are given as bags and each bag consists of some instances. But the research on unsupervised multiple instance clustering is still limited . This paper formulates a novel Maximum Margin Multiple Instance Clustering problem for the multiple instance clustering task. To avoid solving a non-convexย optimization problem directly, M 3 IC is further relaxed, which enables an efficient optimization solution with a combination of Constrained Concave-Convex Procedure CCCP) and the Cutting Plane method. Furthermore, this paper analyzes some important properties of the proposed method and the relationship between the proposed method and some other related ones. An extensive set of empirical results demonstrate the advantages of the proposed method against existing research for both effectiveness and efficiency.
Fast Active Tabu Search and its Application to Image Retrieval
Zhang, Chao (Tongji University) | Li, Hongyu (Tongji University) | Guo, Qiyong (Fudan University) | Jia, Jinyuan (Tongji University) | Shen, I-Fan (Fudan University)
This paper proposes a novel framework for image retrieval. The retrieval is treated as searching for an ordered cycle in an image database. The optimal cycle can be found by minimizing the geometric manifold entropy of images. The minimization is solved by the proposed method, fast active tabu search. Experimental results demonstrate the framework for image retrieval is feasible and quite promising.
Robust Distance Metric Learning with Auxiliary Knowledge
Zha, Zheng-Jun (University of Science and Technology of China) | Mei, Tao (Microsoft Research Asia) | Wang, Meng (Microsoft Research Asia) | Wang, Zengfu (University of Science and Technology of China) | Hua, Xian-Sheng (Microsoft Research Asia)
Most of the existing metric learning methods are accomplished byย exploiting pairwise constraints over the labeled data and frequentlyย suffer from the insufficiency of training examples. ย To learn aย robust distance metric from few labeled examples, prior knowledgeย from unlabeled examples as well as the metrics previously derivedย from auxiliary data sets can be useful. ย In this paper, we proposeย to leverage such auxiliary knowledge to assist distance metricย learning, which is formulated following the regularized lossย minimization principle. ย Two algorithms are derived on the basis ofย manifold regularization and log-determinant divergenceย regularization technique, respectively, which can simultaneouslyย exploit label information (i.e., the pairwise constraints overย labeled data), unlabeled examples, and the metrics derived fromย auxiliary data sets. ย The proposed methods directly manipulate the auxiliary metrics and require no raw examples from the auxiliaryย data sets, which make them efficient and flexible. ย We conductย extensive evaluations to compare our approaches with a number ofย competing approaches on face recognition task. ย The experimentalย results show that our approaches can derive reliable distanceย metrics from limited training examples and thus are superior inย terms of accuracy and labeling efforts.
Transfer Learning using Task-Level Features with Application to Information Retrieval
Yan, Rong (IBM Research) | Zhang, Jian (Purdue University)
We propose a probabilistic transfer learning model that uses task-level features to control the task mixture selection in a hierarchical Bayesian model. These task-level features, although rarely used in existing approaches, can provide additional information to model complex task distributions and allow effective transfer to new tasks especially when only limited number of data are available. To estimate the model parameters, we develop an empirical Bayes method based on variational approximation techniques. Our experiments on information retrieval show that the proposed model achieves significantly better performance compared with other transfer learning methods.
Early Prediction on Time Series: A Nearest Neighbor Approach
Xing, Zhengzheng (Simon Fraser Univeristy) | Pei, Jian (Simon Fraser University) | Yu, Philip S. (University of Illinois at Chicago)
In this paper, we formulate the problem of early classification of time series data, which is important in some time-sensitive applications such as health-informatics. We introduce a novel concept of MPL (Minimum Prediction Length) and develop ECTS (Early Classification on Time Series), an effective 1-nearest neighbor classification method. ECTS makes early predictions and at the same time retains the accuracy comparable to that of a 1NN classifier using the full-length time series. Our empirical study using benchmark time series data sets shows that ECTS works well on the real data sets where 1NN classification is effective.
Generalized Cluster Aggregation
Wang, Fei (Florida International University) | Wang, Xin (Florida International University) | Li, Tao (Florida International University)
Clustering aggregation has emerged as an important extension of the classical clustering problem. It refers to the situation in which a number of different (input) clusterings have been obtained for a particular data set and it is desired to aggregate those clustering results to get a better clustering solution. In this paper, we propose a unified framework to solve the clustering aggregation problem, where the aggregated clustering result is obtained by minimizing the (weighted) sum of the Bregman divergence between it and all the input clusterings. Moreover, under our algorithm framework, we also propose a novel cluster aggregation problem where some must-link and cannot-link constraints are given in addition to the input clusterings. Finally the experimental results on some real world data sets are presented to show the effectiveness of our method.
Multiclass Probabilistic Kernel Discriminant Analysis
Zhao, Zheng (Arizona State Univeristy) | Sun, Liang (Arizona State Univeristy) | Yu, Shipeng (CAD and Knowledge Solutions, Siemens Medical Solutions) | Liu, Huan (Arizona State Univeristy) | Ye, Jieping (Arizona State Univeristy)
Kernel discriminant analysis (KDA) is an effective approach for supervised nonlinear dimensionality reduction. Probabilistic models can be used with KDA to improve its robustness. However, the state of the art of such models could only handle binary class problems, which confines their application in many real world problems. To overcome this limitation, we propose a novel nonparametric probabilistic model based on Gaussian Process for KDA to handle multiclass problems. The model provides a novel Bayesian interpretation for KDA, which allows its parameters to be automatically tuned through the optimization of the marginal loglikelihood of the data. Empirical study demonstrates the efficacy of the proposed model.
Toward Unsupervised Activity Discovery Using Multi Dimensional Motif Detection in Time Series
Vahdatpour, Alireza (University of California, Los Angeles) | Amini, Navid (University of California, Los Angeles) | Sarrafzadeh, Majid (University of California, Los Angeles)
This paper addresses the problem of activity and event discovery in multi dimensional time series data by proposing a novel method for locating multi dimensional motifs in time series. While recent work has been done in finding single dimensional and multi dimensional motifs in time series, we address motifs in general case, where the elements of multi dimensional motifs have temporal, length, and frequency variations. The proposed method is validated by synthetic data, and empirical evaluation has been done on several wearable systems that are used by real subjects.
On the Equivalence Between Canonical Correlation Analysis and Orthonormalized Partial Least Squares
Sun, Liang (Arizona State University) | Ji, Shuiwang (Arizona State University) | Yu, Shipeng (Siemens Medical Solutions USA, Inc.) | Ye, Jieping (Arizona State University)
Canonical correlation analysis (CCA) and partial least squares (PLS) are well-known techniques for feature extraction from two sets of multi-dimensional variables. The fundamental difference between CCA and PLS is that CCA maximizes the correlation while PLS maximizes the covariance. Although both CCA and PLS have been applied successfully in various applications, the intrinsic relationship between them remains unclear. In this paper, we attempt to address this issue by showing the equivalence relationship between CCA and orthonormalized partial least squares (OPLS), a variant of PLS. We further extend the equivalence relationship to the case when regularization is employed for both sets of variables. In addition, we show that the CCA projection for one set of variables is independent of the regularization on the other set of variables. We have performed experimental studies using both synthetic and real data sets and our results confirm the established equivalence relationship. The presented analysis provides novel insights into the connection between these two existing algorithms as well as the effect of the regularization.