 Statistical Learning


Knowledge Transfer on Hybrid Graph

AAAI Conferences

In machine learning problems, labeled data are often in short supply. One feasible solution is transfer learning, which uses labeled data from other domains to classify unlabeled data in the target domain. In this paper, we propose a transfer learning framework based on similarity matrix approximation to tackle such problems. We propose two practical algorithms: label propagation and similarity propagation. In both methods, we build a hybrid graph from all available data. Information is then transferred across domains by alternately constructing the similarity matrix for different parts of the graph. Among all related methods, the similarity propagation approach makes maximum use of the available similarity information across domains, which leads to more efficient transfer and better learning results. Experiments on real-world text mining applications demonstrate the promise and effectiveness of our algorithms.
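
As a rough illustration of label propagation on a graph that mixes labeled and unlabeled nodes, the sketch below uses the generic clamped-propagation scheme. It is not the paper's similarity-matrix-approximation algorithm; the function name `label_propagation`, the affinity matrix `W`, and the toy chain graph are all assumptions made for this example.

```python
import numpy as np

def label_propagation(W, Y0, labeled_mask, n_iter=200):
    """Generic clamped label propagation (illustrative, not the paper's method).

    W            : (n, n) symmetric, non-negative affinity matrix of the graph
    Y0           : (n, k) one-hot rows for labeled nodes, zero rows otherwise
    labeled_mask : (n,) boolean array, True where the label is known
    """
    # Row-normalize affinities so each node averages over its neighbors.
    P = W / W.sum(axis=1, keepdims=True)
    Y = Y0.astype(float).copy()
    for _ in range(n_iter):
        Y = P @ Y                            # spread label mass along edges
        Y[labeled_mask] = Y0[labeled_mask]   # clamp the known labels
    return Y.argmax(axis=1)

# Toy chain 0 - 1 ~ 2 - 3: node 0 is labeled class 0 and node 3 class 1
# (think "source domain"); nodes 1 and 2 are unlabeled ("target domain").
W = np.array([[0.0, 1.0, 0.0, 0.0],
              [1.0, 0.0, 0.1, 0.0],
              [0.0, 0.1, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])
Y0 = np.array([[1, 0], [0, 0], [0, 0], [0, 1]])
mask = np.array([True, False, False, True])
pred = label_propagation(W, Y0, mask)
```

The weak edge between nodes 1 and 2 means each unlabeled node ends up with the label of its strongly connected labeled neighbor.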


Generalized Cluster Aggregation

AAAI Conferences

Clustering aggregation has emerged as an important extension of the classical clustering problem. It refers to the situation in which several different (input) clusterings have been obtained for a particular data set and the goal is to aggregate those clustering results into a better clustering solution. In this paper, we propose a unified framework to solve the clustering aggregation problem, where the aggregated clustering result is obtained by minimizing the (weighted) sum of the Bregman divergences between it and all the input clusterings. Moreover, under our algorithmic framework, we also propose a novel cluster aggregation problem where some must-link and cannot-link constraints are given in addition to the input clusterings. Finally, experimental results on real-world data sets are presented to show the effectiveness of our method.
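
To make the objective concrete, the sketch below takes one particular Bregman divergence, the squared Euclidean distance, and represents each clustering by its co-association matrix. That representation is an assumption of this example, not necessarily the paper's. Under these assumptions, the unconstrained minimizer of the weighted sum is the weighted mean of the input matrices. The helper names and the thresholding step used to read off a final clustering are hypothetical, and must-link/cannot-link constraints are not handled.

```python
import numpy as np

def coassociation(labels):
    """Entry (i, j) is 1.0 if points i and j share a cluster, else 0.0."""
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(float)

def aggregate(clusterings, weights=None):
    """Weighted mean of co-association matrices.

    This minimizes sum_i w_i * ||M - M_i||_F^2 over all real matrices M.
    """
    mats = np.stack([coassociation(c) for c in clusterings])
    w = np.ones(len(mats)) if weights is None else np.asarray(weights, float)
    w = w / w.sum()
    return np.tensordot(w, mats, axes=1)

def consensus_labels(M, threshold=0.5):
    """Hypothetical read-out: connected components of the graph M > threshold."""
    n = len(M)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and M[i, j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

# Three input clusterings of four points; cluster ids are arbitrary per input.
inputs = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 0, 0, 1]]
M = aggregate(inputs)
final = consensus_labels(M)
```

Because co-association matrices ignore the cluster ids themselves, the first two inputs count as identical even though their labels are swapped.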


Manifold Alignment without Correspondence

AAAI Conferences

Manifold alignment has been found to be useful in many areas of machine learning and data mining. In this paper, we introduce a novel manifold alignment approach, which differs from semi-supervised alignment and Procrustes alignment in that it does not require predetermined correspondences. Our approach learns a projection that maps data instances (from two different spaces) to a lower-dimensional space, simultaneously matching the local geometry and preserving the neighborhood relationship within each set. This approach also builds connections between spaces defined by different features and makes direct knowledge transfer possible. The performance of our algorithm is demonstrated and validated in a series of carefully designed experiments in information retrieval and bioinformatics.


Multiclass Probabilistic Kernel Discriminant Analysis

AAAI Conferences

Kernel discriminant analysis (KDA) is an effective approach for supervised nonlinear dimensionality reduction. Probabilistic models can be used with KDA to improve its robustness. However, state-of-the-art models of this kind can only handle binary-class problems, which limits their use in many real-world problems. To overcome this limitation, we propose a novel nonparametric probabilistic model based on Gaussian processes for KDA to handle multiclass problems. The model provides a novel Bayesian interpretation of KDA, which allows its parameters to be tuned automatically by optimizing the marginal log-likelihood of the data. An empirical study demonstrates the efficacy of the proposed model.


Toward Unsupervised Activity Discovery Using Multi Dimensional Motif Detection in Time Series

AAAI Conferences

This paper addresses the problem of activity and event discovery in multidimensional time series data by proposing a novel method for locating multidimensional motifs in time series. While recent work has addressed finding single-dimensional and multidimensional motifs in time series, we address motifs in the general case, where the elements of multidimensional motifs have temporal, length, and frequency variations. The proposed method is validated on synthetic data and evaluated empirically on several wearable systems used by real subjects.


Latent Variable Perceptron Algorithm for Structured Classification

AAAI Conferences

We propose a perceptron-style algorithm for fast discriminative training of structured latent variable models. This method extends the perceptron algorithm to learning with latent dependencies, as an alternative to existing probabilistic latent variable models. It relies on Viterbi decoding over latent variables, combined with simple additive updates. Its training cost is significantly lower than that of probabilistic latent variable models, while it gives comparable or even superior classification accuracy on our tasks. Experiments on natural language processing problems demonstrate that its results rank among the good results reported on the corresponding data sets.


On the Equivalence Between Canonical Correlation Analysis and Orthonormalized Partial Least Squares

AAAI Conferences

Canonical correlation analysis (CCA) and partial least squares (PLS) are well-known techniques for feature extraction from two sets of multi-dimensional variables. The fundamental difference between CCA and PLS is that CCA maximizes the correlation while PLS maximizes the covariance. Although both CCA and PLS have been applied successfully in various applications, the intrinsic relationship between them remains unclear. In this paper, we attempt to address this issue by showing the equivalence relationship between CCA and orthonormalized partial least squares (OPLS), a variant of PLS. We further extend the equivalence relationship to the case when regularization is employed for both sets of variables. In addition, we show that the CCA projection for one set of variables is independent of the regularization on the other set of variables. We have performed experimental studies using both synthetic and real data sets and our results confirm the established equivalence relationship. The presented analysis provides novel insights into the connection between these two existing algorithms as well as the effect of the regularization.
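
The correlation-versus-covariance distinction can be written out for the first pair of projection directions. The notation here is an assumption introduced for illustration and does not come from the abstract: $C_{xx}$ and $C_{yy}$ are the within-set covariance matrices, $C_{xy}$ is the cross-covariance matrix, and $w_x$, $w_y$ are the projection vectors for the two sets of variables.

```latex
% CCA: maximize the correlation between the projections
\max_{w_x,\, w_y} \;
\frac{w_x^{\top} C_{xy}\, w_y}
     {\sqrt{w_x^{\top} C_{xx}\, w_x}\;\sqrt{w_y^{\top} C_{yy}\, w_y}}

% PLS: maximize the covariance between the projections
\max_{w_x,\, w_y} \; w_x^{\top} C_{xy}\, w_y
\quad \text{subject to} \quad \lVert w_x \rVert = \lVert w_y \rVert = 1
```

The two numerators are the same. CCA additionally normalizes by the variance of each projection, so its objective does not change when either set of variables is rescaled.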


Predictive Projections

AAAI Conferences

[…] policies in very high dimensional state spaces. We propose a linear dimensionality reduction algorithm that discovers predictive projections: projections in which accurate predictions of future states can be made using simple nearest neighbor style learning. The goal of this work is to extend the reach of existing reinforcement learning algorithms to domains where they would otherwise be inapplicable

These existing algorithms discover projections of the training data under which nearby points are likely to have the same class label or similar regression targets. The algorithm described in this paper makes use of the same machinery but attempts to find low-dimensional projections under which current state vectors accurately predict future states in the projected space. The intuition is that projections which capture the state dynamics in this way are likely to contain information that will be useful for control.


Semi-Supervised Metric Learning Using Pairwise Constraints

AAAI Conferences

The distance metric plays an important role in many machine learning algorithms. Recently, metric learning for semi-supervised algorithms has received much attention. For semi-supervised clustering, a set of pairwise similarity and dissimilarity constraints is usually provided as supervisory information. Various metric learning methods that use pairwise constraints have been proposed. The existing methods that can consider both positive (must-link) and negative (cannot-link) constraints find linear transformations or, equivalently, global Mahalanobis metrics. Additionally, they find metrics based only on the data points that appear in constraints, without considering other data points. In this paper, we consider the topological structure of the data along with both positive and negative constraints. We propose a kernel-based metric learning method that provides a non-linear transformation. Experimental results on synthetic and real-world data sets show the effectiveness of our metric learning method.
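
The equivalence between a linear transformation and a global Mahalanobis metric can be checked numerically. In the sketch below, the matrix `A` and the points `x` and `y` are arbitrary values chosen for illustration. The Mahalanobis distance with `M = A.T @ A` coincides with the ordinary Euclidean distance after mapping each point through `A`.

```python
import numpy as np

def mahalanobis(x, y, M):
    """Distance sqrt((x - y)^T M (x - y)) for a positive semidefinite M."""
    d = np.asarray(x, float) - np.asarray(y, float)
    return float(np.sqrt(d @ M @ d))

# Arbitrary linear transformation (an assumption of this example).
A = np.array([[2.0, 0.0],
              [1.0, 1.0]])
M = A.T @ A  # the Mahalanobis matrix induced by A

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

d_metric = mahalanobis(x, y, M)                       # distance under M
d_transformed = float(np.linalg.norm(A @ x - A @ y))  # Euclidean after A
```

A kernel-based method such as the one proposed in the abstract replaces the single global matrix `A` with a non-linear map, which this linear sketch does not cover.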


Streamed Learning: One-Pass SVMs

AAAI Conferences

We present a streaming model for large-scale classification (in the context of ℓ2-SVM) by leveraging connections between learning and computational geometry. The streaming model imposes the constraint that only a single pass over the data is allowed. The ℓ2-SVM is known to have an equivalent formulation in terms of the minimum enclosing ball (MEB) problem, and an efficient algorithm based on the idea of core sets exists (CVM) [Tsang et al., 2005]. CVM learns a (1 + ε)-approximate MEB for a set of points and yields an approximate solution to the corresponding SVM instance. However, CVM works in batch mode and requires multiple passes over the data. This paper presents a single-pass SVM based on the minimum enclosing ball of streaming data. We show that the MEB updates for the streaming case can be easily adapted to learn the SVM weight vector in a way similar to using online stochastic gradient updates. Our algorithm performs polylogarithmic computation at each example and requires very small, constant storage. Experimental results show that, even in such restrictive settings, we can learn efficiently in just one pass and get accuracies comparable to other state-of-the-art SVM solvers (batch and online). We also give an analysis of the algorithm and discuss some open issues and possible extensions.
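
For intuition about one-pass MEB maintenance, the sketch below implements a simple streaming update. Whenever a new point falls outside the current ball, the ball is replaced by the smallest ball that contains both the old ball and the new point. This shows only the geometric step on raw points. The function name `streaming_meb` is hypothetical, and the adaptation to SVM weight vectors described in the abstract is not reproduced here.

```python
import numpy as np

def streaming_meb(points):
    """One-pass enclosing ball: O(d) work and O(d) storage per point.

    Returns (center, radius) of a ball containing every point seen.
    It is an enclosing ball, not necessarily the minimum one.
    """
    stream = iter(points)
    center = np.asarray(next(stream), float).copy()
    radius = 0.0
    for p in stream:
        p = np.asarray(p, float)
        dist = np.linalg.norm(p - center)
        if dist > radius:
            # Grow just enough to cover p while still covering the old ball:
            # move the center toward p and enlarge the radius by the same shift.
            shift = (dist - radius) / 2.0
            center += shift * (p - center) / dist
            radius += shift
    return center, radius

pts = [(0.0, 0.0), (2.0, 0.0), (1.0, 0.0), (1.0, 0.5)]
center, radius = streaming_meb(pts)
```

Each point is examined once and then discarded, which matches the single-pass constraint. The price is that the resulting ball can be larger than the true minimum enclosing ball.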