 Statistical Learning


Positive Unlabeled Learning for Time Series Classification

AAAI Conferences

In many real-world applications of the time series classification problem, not only may the negative training instances be missing, but the number of positive instances available for learning may also be rather limited. This has motivated the development of new classification algorithms that can learn from a small set P of labeled seed positive instances augmented with a set U of unlabeled instances (i.e., PU learning algorithms). However, existing PU learning algorithms for time series classification have less than satisfactory performance because they are unable to identify the class boundary between positive and negative instances accurately. In this paper, we propose a novel PU learning algorithm, LCLC (Learning from Common Local Clusters), for time series classification. LCLC is designed to effectively identify the ground truths' positive and negative boundaries, resulting in more accurate classifiers than those constructed using existing methods. We have applied LCLC to classify time series data from different application domains; the experimental results demonstrate that LCLC outperforms existing methods significantly.


Multi-Kernel Gaussian Processes

AAAI Conferences

Multi-task learning remains a difficult yet important problem in machine learning. In Gaussian processes the main challenge is the definition of valid kernels (covariance functions) able to capture the relationships between different tasks. This paper presents a novel methodology to construct valid multi-task covariance functions (Mercer kernels) for Gaussian processes allowing for a combination of kernels with different forms. The method is based on Fourier analysis and is general for arbitrary stationary covariance functions. Analytical solutions for cross covariance terms between popular forms are provided, including Matérn, squared exponential and sparse covariance functions. Experiments are conducted with both artificial and real datasets demonstrating the benefits of the approach.


Combining Supervised and Unsupervised Models Via Unconstrained Probabilistic Embedding

AAAI Conferences

Ensemble learning with output from multiple supervised and unsupervised models aims to improve the classification accuracy of a supervised model ensemble by jointly considering the grouping results from unsupervised models. In this paper we cast this ensemble task as an unconstrained probabilistic embedding problem. Specifically, we assume both objects and classes/clusters have latent coordinates without constraints in a D-dimensional Euclidean space, and consider the mapping from the embedded space into the space of results from supervised and unsupervised models as a probabilistic generative process. The prediction of an object is then determined by the distances between the object and the classes in the embedded space. A solution of this embedding can be obtained using the quasi-Newton method, resulting in the objects and classes/clusters with high co-occurrence weights being embedded close together. We demonstrate the benefits of this unconstrained embedding method on three real applications.


Ball Ranking Machines for Content-Based Multimedia Retrieval

AAAI Conferences

In this paper, we propose the new Ball Ranking Machines (BRMs) to address supervised ranking problems. In previous work, supervised ranking methods have been successfully applied in various information retrieval tasks. Among these methodologies, the Ranking Support Vector Machines (Ranking SVMs) are well investigated. However, one major factor limiting their application is that Ranking SVMs need to optimize a margin-based objective function over all possible document pairs within all queries on the training set. As a consequence, Ranking SVMs need to select a large number of support vectors from a huge number of support vector candidates. This paper introduces a new model of Ranking SVMs and develops an efficient approximation algorithm, which decreases the training time and generates far fewer support vectors. Empirical studies on synthetic data and content-based image/video retrieval data show that our method is comparable to Ranking SVMs in accuracy, but uses far fewer ranking support vectors and significantly less training time.
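To make the pairwise cost mentioned in this abstract concrete, the following is a minimal sketch of how a standard pairwise ranking formulation builds its training set: every pair of documents with different relevance inside the same query yields one difference vector. It is not the BRM algorithm from the paper; the function name `preference_pairs` and the toy data are illustrative assumptions.

```python
from itertools import combinations

import numpy as np


def preference_pairs(features, relevance, query_ids):
    """Build difference vectors for all within-query preference pairs.

    Hypothetical helper for illustration only: each returned row is
    (features of the more relevant doc) - (features of the less relevant doc).
    """
    features = np.asarray(features, dtype=float)
    relevance = np.asarray(relevance)
    query_ids = np.asarray(query_ids)
    diffs = []
    for q in np.unique(query_ids):
        idx = np.flatnonzero(query_ids == q)
        for i, j in combinations(idx, 2):
            if relevance[i] == relevance[j]:
                continue  # ties carry no preference information
            hi, lo = (i, j) if relevance[i] > relevance[j] else (j, i)
            diffs.append(features[hi] - features[lo])
    # reshape keeps a 2-D result even when no pairs exist
    return np.array(diffs, dtype=float).reshape(-1, features.shape[1])


# Toy data (assumed): 2 queries, 3 documents each, graded relevance 2 > 1 > 0.
features = np.arange(12.0).reshape(6, 2)
relevance = np.array([2, 1, 0, 2, 1, 0])
query_ids = np.array([0, 0, 0, 1, 1, 1])
pairs = preference_pairs(features, relevance, query_ids)
print(pairs.shape)
```

The number of pairs grows quadratically with the number of documents per query, which is the source of the large candidate set of support vectors that the abstract describes.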


Cluster Indicator Decomposition for Efficient Matrix Factorization

AAAI Conferences

We propose a new clustering-based low-rank matrix approximation method, Cluster Indicator Decomposition (CID), which yields more accurate low-rank approximations than the commonly used singular value decomposition and other Nyström-style decompositions. Our model utilizes the intrinsic structures of data and is theoretically more compact and accurate than traditional low-rank approximation approaches. The reconstruction in CID is extremely fast, which gives our method a desirable advantage in large-scale kernel machines (like Support Vector Machines), in which the reconstruction of the kernels needs to be computed frequently. Experimental results indicate that our approach compresses images much more efficiently than other factorization-based methods. We show that combining our method with Support Vector Machines obtains more accurate approximation and more accurate prediction while consuming far fewer computational resources.


Locality-Constrained Concept Factorization

AAAI Conferences

Matrix factorization based techniques, such as nonnegative matrix factorization (NMF) and concept factorization (CF), have attracted great attention in dimension reduction and data clustering. Both of them are linear learning problems and lead to a sparse representation of the data. However, the sparsity obtained by these methods does not always satisfy locality conditions, so the obtained data representation is not the best. This paper introduces a locality-constrained concept factorization method which imposes a locality constraint onto the traditional concept factorization. By requiring the concepts (basis vectors) to be as close to the original data points as possible, each data point can be represented by a linear combination of only a few basis concepts. Thus our method is able to achieve sparsity and locality at the same time. We demonstrate the effectiveness of this novel algorithm through a set of evaluations on real world applications.


Probit Classifiers with a Generalized Gaussian Scale Mixture Prior

AAAI Conferences

Most of the existing probit classifiers are based on sparsity-oriented modeling. However, we show that sparsity is not always desirable in practice, and only an appropriate degree of sparsity is profitable. In this work, we propose a flexible probabilistic model using a generalized Gaussian scale mixture prior that can promote an appropriate degree of sparsity for its model parameters, and yield either sparse or non-sparse estimates according to the intrinsic sparsity of features in a dataset. Model learning is carried out by an efficient modified maximum a posteriori (MAP) estimate. We also show relationships of the proposed model to existing probit classifiers as well as iteratively re-weighted l1 and l2 minimizations. Experiments demonstrate that the proposed method has better or comparable performance in feature selection for linear classifiers as well as in kernel-based classification.


Modular Community Detection in Networks

AAAI Conferences

Network community detection, the problem of dividing a network of interest into clusters for intelligent analysis, has recently attracted significant attention in diverse fields of research. To discover intrinsic community structure, a quantitative measure called modularity has been widely adopted as an optimization objective. Unfortunately, modularity is inherently NP-hard to optimize and approximate solutions must be sought if tractability is to be ensured. In practice, a spectral relaxation method is most often adopted, after which a community partition is recovered from relaxed fractional values by a rounding process. In this paper, we propose an iterative rounding strategy for identifying the partition decisions that is coupled with a fast constrained power method that sequentially achieves tighter spectral relaxations. Extensive evaluation with this coupled relaxation-rounding method demonstrates consistent and sometimes dramatic improvements in the modularity of the communities discovered.
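As a point of reference for the relaxation-then-rounding pipeline this abstract describes, here is a minimal sketch of the standard spectral baseline for splitting a graph into two communities: take the leading eigenvector of the modularity matrix and round it by sign. This is the conventional one-shot approach, not the paper's iterative rounding or constrained power method; the function names and the toy graph are assumptions made for illustration.

```python
import numpy as np


def modularity_matrix(adj):
    """B = A - k k^T / (2m) for an undirected graph with adjacency A."""
    adj = np.asarray(adj, dtype=float)
    degrees = adj.sum(axis=1)
    return adj - np.outer(degrees, degrees) / degrees.sum()


def modularity(adj, labels):
    """Modularity Q of a hard partition given as one label per node."""
    adj = np.asarray(adj, dtype=float)
    labels = np.asarray(labels)
    same_community = np.equal.outer(labels, labels)
    return float((modularity_matrix(adj) * same_community).sum() / adj.sum())


def spectral_bisection(adj):
    """Relax the +/-1 indicator to a real vector, then round by sign."""
    eigvals, eigvecs = np.linalg.eigh(modularity_matrix(adj))
    leading = eigvecs[:, -1]  # eigh returns eigenvalues in ascending order
    return (leading > 0).astype(int)


# Toy graph (assumed): two triangles joined by the single edge 2-3.
adj = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    adj[i, j] = adj[j, i] = 1.0

labels = spectral_bisection(adj)
print(labels, modularity(adj, labels))
```

On this toy graph the sign rounding separates the two triangles. Which triangle receives label 0 depends on the arbitrary sign of the eigenvector returned by the solver.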


Learning Hash Functions for Cross-View Similarity Search

AAAI Conferences

Many applications in Multilingual and Multimodal Information Access involve searching large databases of high dimensional data objects with multiple (conditionally independent) views. In this work we consider the problem of learning hash functions for similarity search across the views for such applications. We propose a principled method for learning a hash function for each view given a set of multiview training data objects. The hash functions map similar objects to similar codes across the views thus enabling cross-view similarity search. We present results from an extensive empirical study of the proposed approach which demonstrate its effectiveness on Japanese language People Search and Multilingual People Search problems.


Incremental Slow Feature Analysis

AAAI Conferences

The Slow Feature Analysis (SFA) unsupervised learning framework extracts features representing the underlying causes of the changes within a temporally coherent high-dimensional raw sensory input signal. We develop the first online version of SFA, via a combination of incremental Principal Components Analysis and Minor Components Analysis. Unlike standard batch-based SFA, online SFA adapts along with non-stationary environments, which makes it a generally useful unsupervised preprocessor for autonomous learning agents. We compare online SFA to batch SFA in several experiments and show that it indeed learns without a teacher to encode the input stream by informative slow features representing meaningful abstract environmental properties. We extend online SFA to deep networks in hierarchical fashion, and use them to successfully extract abstract object position information from high-dimensional video.
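For orientation, the following is a minimal sketch of the batch, linear form of SFA that the abstract contrasts with its online version: whiten the input with PCA, then take the minor components (smallest-eigenvalue directions) of the covariance of the temporal differences. It does not reproduce the paper's incremental PCA and Minor Components Analysis updates; the function name `linear_sfa` and the synthetic signals are assumptions for illustration.

```python
import numpy as np


def linear_sfa(x, n_features=1):
    """Batch linear SFA sketch: PCA whitening, then minor components of the
    covariance of first differences. Assumes x has shape (time, dims) and a
    full-rank covariance."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean(axis=0)
    # Step 1: whiten so that every projection has unit variance.
    cov = x.T @ x / (len(x) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)
    whitener = eigvecs / np.sqrt(eigvals)  # scales column j by 1/sqrt(eigvals[j])
    z = x @ whitener
    # Step 2: the slowest directions minimize the variance of the differences.
    dz = np.diff(z, axis=0)
    dcov = dz.T @ dz / (len(dz) - 1)
    _, dvecs = np.linalg.eigh(dcov)  # ascending order: slowest directions first
    weights = whitener @ dvecs[:, :n_features]
    return x @ weights, weights


# Synthetic input (assumed): a slow and a fast sinusoid, linearly mixed.
t = np.linspace(0.0, 2.0 * np.pi, 1000)
slow, fast = np.sin(t), np.sin(20.0 * t)
mixed = np.column_stack([slow + 0.5 * fast, 0.3 * slow - fast])
features, weights = linear_sfa(mixed, n_features=1)
print(abs(np.corrcoef(features[:, 0], slow)[0, 1]))
```

The recovered feature matches the slow source only up to sign and scale, which is why the check uses the absolute correlation.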