Goto

Collaborating Authors

 Europe


Domain Adaptation via Transfer Component Analysis

AAAI Conferences

Domain adaptation solves a learning problem in a target domain by utilizing the training data in a different but related source domain. Intuitively, discovering a good feature representation across domains is crucial. In this paper, we propose to find such a representation through a new learning method, transfer component analysis ( TCA ), for domain adaptation. TCA tries to learn some transfer components across domains in a Reproducing Kernel Hilbert Space (RKHS) using Maximum Mean Discrepancy (MMD). In the subspace spanned by these transfer components , data distributions in different domains are close to each other. As a result, with the new representations in this subspace, we can apply standard machine learning methods to train classifiers or regression models in the source domain for use in the target domain. The main contribution of our work is that we propose a novel feature representation in which to perform domain adaptation via a new parametric kernel using feature extraction methods, which can dramatically minimize the distance between domain distributions by projecting data onto the learned transfer components . Furthermore, our approach can handle large datsets and naturally lead to out-of-sample generalization. The effectiveness and efficiency of our approach in are verified by experiments on two real-world applications: cross-domain indoor WiFi localization and cross-domain text classification.


Spectral Kernel Learning for Semi-Supervised Classification

AAAI Conferences

Typical graph-theoretic approaches for semi-supervised classification infer labels of unlabeled instances with the help of graph Laplacians. Founded on the spectral decomposition of the graph Laplacian, this paper learns a kernel matrix via minimizing the leave-one-out classification error on the labeled instances.ย  To this end, an efficient algorithm is presented based on linear programming, resulting in a transductive spectral kernel. The idea of our algorithm stems from regularization methodology and also has a nice interpretation in terms of spectral clustering. A simple classifier can be readily built upon the learned kernel, which suffices to give prediction for any data point aside from those in the available dataset. Besides this usage, the spectral kernel can be effectively used in tandem with conventional kernel machines such as SVMs. We demonstrate the efficacy of the proposed algorithm through experiments carried out on challenging classification tasks.


Learning the Optimal Neighborhood Kernel for Classification

AAAI Conferences

Kernel methods have been applied successfully in many applications. The kernel matrix plays an important role in kernel-based learning methods, but the ideal kernel matrix is usually unknown in practice and needs to be estimated. In this paper, we propose to directly learn the ideal kernel matrix (called the optimal neighborhood kernel matrix) from a pre-specified kernel matrix for improved classification performance. We assume that the pre-specified kernel matrix generated from the specific application is a noisy observation of the ideal one. The resulting optimal neighborhood kernel matrix is shown to be the summation of the pre-specified kernel matrix and a rank-one matrix. We formulate the problem of learning the optimal neighborhood kernel as a constrained quartic problem, and propose to solve it using two methods: level method and constrained gradient descent. Empirical results on several benchmark data sets demonstrate the efficiency and effectiveness of the proposed algorithms.


Boosting Constrained Mutual Subspace Method for Robust Image-set Based Object Recognition

AAAI Conferences

Object recognition using image-set or video sequence as input tends to be more robust since image-set or video sequence provides much more information than single snap-shot about the variability in the appearance of the target subject. Constrained Mutual Subspace Method (CMSM) is one of the state-of-the-art algorithms for imageset based object recognition by first projecting the image-set patterns onto the so-called generalized difference subspace then classifying based on the principal angle based mutual subspace distance. By treating the subspace bases for each image-set patterns as basic elements in the grassmann manifold, this paper presents a framework for robust image-set based recognition by CMSM based ensemble learning in a boosting way. The proposed Boosting Constrained Mutual Subspace Method(BCMSM) improves the original CMSM in the following ways: a) The proposed BCMSM algorithm is insensitive to the dimension of the generalized differnce subspace while the performance of the original CMSM algorithm is quite dependent on the dimension and the selecting of optimum choice is quite empirical and case-dependent; b) By taking advantage of both boosting and CMSM techniques, the generalization ability is improved and much higher classification performance can be achieved. Extensive experiments on real-life data sets (two face recognition tasks and one 3D object category classification task) show that the proposed method outperforms the previous state-of-the-art algorithms greatly in terms of classification accuracy.


Exploiting Multi-Modal Interactions: A Unified Framework

AAAI Conferences

Given an imagebase with tagged images, four types of tasks an be executed, i.e., content-based image retrieval, image annotation, text-based image retrieval, and query expansion. For any of these tasks the similarity on the concerned type of objects is essential. In this paper, we propose a framework to tackle these four tasks from a unified view. The essence of the framework is to estimate similarities by exploiting the interactions between objects of different modality. Experiments show that the proposed method can improve similarity estimation, and based on the improved similarity estimation, some simple methods can achieve better performances than some state-of-the-art techniques.


Unsupervised Rank Aggregation with Domain-Specific Expertise

AAAI Conferences

Consider the setting where a panel of judges is repeatedly asked to (partially) rank sets of objects according to given criteria, and assume that the judges' expertise depends on the objects' domain.ย  Learning to aggregate their rankings with the goal of producing a better joint ranking is a fundamental problem in many areas of Information Retrieval and Natural Language Processing, amongst others.ย  However, supervised ranking data is generally difficult to obtain, especially if coming from multiple domains.ย  Therefore, we propose a framework for learning to aggregate votes of constituent rankers with domain specific expertise without supervision.ย  We apply the learning framework to the settings of aggregating full rankings and aggregating top-k lists, demonstrating significant improvements over a domain-agnostic baseline in both cases.


Local Query Mining in a Probabilistic Prolog

AAAI Conferences

Local pattern mining is concerned with finding the set of patterns that satisfy a constraint in a database. We study local pattern mining in the context of ProbLog, a probabilistic Prolog system, and introduce an approach for finding correlated patterns in the form of queries in such a Prolog system. The approach combines principles of inductive logic programming, data mining and statistical relational learning. Experiments on a challenging biological network mining task provide evidence for the interestingness of the approach.


gRegress: Extracting Features from Graph Transactions for Regression

AAAI Conferences

In this work we propose gRegress, a new algorithm which given set of labeled graphs and a real value associated with each graph extracts the complete set of subgraphs such that a) each subgraph in this set has correlation with the real value above a user-specified threshold and b) each subgraph in this set has correlation with any other subgraph in the set below a user-specified threshold. gRegress incorporates novel pruning mechanisms based on correlation of a subgraph feature with the output and correlation with other subgraph features. These pruning mechanisms lead to significant speedup. Experimental results indicate that in terms of runtime, gRegress substantially outperforms gSpan, often by an order of magnitude while the regression models produced by both approaches have comparable accuracy.


Local Learning Regularized Nonnegative Matrix Factorization

AAAI Conferences

Nonnegative Matrix Factorization (NMF) has been widely used in machine learning and data mining. It aims to find two nonnegative matrices whose product can well approximate the nonnegative data matrix, which naturally lead to parts-based representation. In this paper, we present a local learning regularized nonnegative matrix factorization (LLNMF) for clustering. It imposes an additional constraint on NMF that the cluster label of each point can be predicted by the points in its neighborhood. This constraint encodes both the discriminative information and the geometric structure, and is good at clustering data on manifold. An iterative multiplicative updating algorithm is proposed to optimize the objective, and its convergence is guaranteed theoretically. Experiments on many benchmark data sets demonstrate that the proposed method outperforms NMF as well as many state of the art clustering methods.


Angluin-Style Learning of NFA

AAAI Conferences

We introduce NL*, a learning algorithm for inferring non-deterministic finite-state automata using membership and equivalence queries. More specifically, residual finite-state automata (RFSA) are learned similarly as in Angluin's popular L* algorithm, which, however, learns deterministic finite-state automata (DFA). Like in a DFA, the states of an RFSA represent residual languages. Unlike a DFA, an RFSA restricts to prime residual languages, which cannot be described as the union of other residual languages. In doing so, RFSA can be exponentially more succinct than DFA. They are, therefore, the preferable choice for many learning applications. The implementation of our algorithm is applied to a collection of examples and confirms the expected advantage of NL* over L*.