Goto

Collaborating Authors

 Genre


Semi-Supervised Kernel Matching for Domain Adaptation

AAAI Conferences

In this paper, we propose a semi-supervised kernel matching method to address domain adaptation problems where the source distribution substantially differs from the target distribution. Specifically, we learn a prediction function on the labeled source data while mapping the target data points to similar source data points by matching the target kernel matrix to a submatrix of the source kernel matrix based on a Hilbert Schmidt Independence Criterion. We formulate this simultaneous learning and mapping process as a non-convex integer optimization problem and present a local minimization procedure for its relaxed continuous form. Our empirical results show the proposed kernel matching method significantly outperforms alternative methods on the task of across domain sentiment classification.


Colorization by Matrix Completion

AAAI Conferences

Given a monochrome image and some manually labeled pixels, the colorization problem is a computer-assisted process of adding color to the monochrome image. This paper proposes a novel approach to the colorization problem by formulating it as a matrix completion problem. In particular, taking a monochrome image and parts of the color pixels (labels) as inputs, we develop a robust colorization model and resort to an augmented Lagrange multiplier algorithm for solving the model. Our approach is based on the fact that a matrix can be represented as a low-rank matrix plus a sparse matrix. Our approach is effective because it is able to handle the potential noises in the monochrome image and outliers in the labels. To improve the performance of our method, we further incorporate a so-called local-color-consistency idea into our method. Empirical results on real data sets are encouraging.


Learning from Demonstration for Goal-Driven Autonomy

AAAI Conferences

Goal-driven autonomy (GDA) is a conceptual model for creating an autonomous agent that monitors a set of expectations during plan execution, detects when discrepancies occur, builds explanations for the cause of failures, and formulates new goals to pursue when planning failures arise. While this framework enables the development of agents that can operate in complex and dynamic environments, implementing the logic for each of the subtasks in the model requires substantial domain engineering. We present a method using case-based reasoning and intent recognition in order to build GDA agents that learn from demonstrations. Our approach reduces the amount of domain engineering necessary to implement GDA agents and learns expectations, explanations, and goals from expert demonstrations. We have applied this approach to build an agent for the real-time strategy game StarCraft. Our results show that integrating the GDA conceptual model into the agent greatly improves its win rate.


Manifold Warping: Manifold Alignment over Time

AAAI Conferences

Knowledge transfer is computationally challenging, due in part to the curse of dimensionality, compounded by source and target domains expressed using different features (e.g., documents written in different languages). Recent work on manifold learning has shown that data collected in real-world settings often have high-dimensional representations, but lie on low-dimensional manifolds. Furthermore, data sets collected from similar generating processes often present different high-dimensional views, even though their underlying manifolds are similar. The ability to align these data sets and extract this common structure is critical for many transfer learning tasks. In this paper, we present a novel framework for aligning two sequentially-ordered data sets, taking advantage of a shared low-dimensional manifold representation. Our approach combines traditional manifold alignment and dynamic time warping algorithms using alternating projections. We also show that the previously-proposed canonical time warping algorithm is a special case of our approach. We provide a theoretical formulation as well as experimental results on synthetic and real-world data, comparing manifold warping to other alignment methods.


Markov Network Structure Learning: A Randomized Feature Generation Approach

AAAI Conferences

The structure of a Markov network is typically learned in one of two ways. The first approach is to treat this task as a global search problem. However, these algorithms are slow as they require running the expensive operation of weight (i.e., parameter) learning many times. The second approach involves learning a set of local models and then combining them into a global model. However, it can be computationally expensive to learn the local models for datasets that contain a large number of variables and/or examples. This paper pursues a third approach that views Markov network structure learning as a feature generation problem. The algorithm combines a data-driven, specific-to-general search strategy with randomization to quickly generate a large set of candidate features that all have support in the data. It uses weight learning, with L1 regularization, to select a subset of generated features to include in the model. On a large empirical study, we find that our algorithm is equivalently accurate to other state-of-the-art methods while exhibiting a much faster run time.


Name-Ethnicity Classification and Ethnicity-Sensitive Name Matching

AAAI Conferences

Personal names are important and common information in many data sources, ranging from social networks and news articles to patient records and scientific documents.They are often used as queries for retrieving records and also as key information for linking documents from multiple sources. Matching personal names can be challenging due to variations in spelling and various formatting of names. While many approximated name matching techniques have been proposed, most are generic string-matching algorithms. Unlike other types of proper names, personal names are highly cultural. Many ethnicities have their own unique naming systems and identifiable characteristics. In this paper we explore such relationships between ethnicities and personal names to improve the name matching performance. First, we propose a name-ethnicity classifier based on the multinomial logistic regression. Our model can effectively identify name-ethnicity from personal names in Wikipedia, which we use to define name-ethnicity, to within 85\% accuracy.Next, we propose a novel alignment-based name matching algorithm, based on Smithโ€“Waterman algorithm and logistic regression.Different name matching models are then trained for different name-ethnicity groups.Our preliminary experimental result on DBLP's disambiguated author dataset yields a performance of 99\% precision and 89\% recall.Surprisingly, textual features carry more weight than phonetic ones in name-ethnicity classification.


Convex Matching Pursuit for Large-Scale Sparse Coding and Subset Selection

AAAI Conferences

In this paper, a new convex matching pursuit scheme is proposed for tackling large-scale sparse coding and subset selection problems. In contrast with current matching pursuit algorithms such as subspace pursuit (SP), the proposed algorithm has a convex formulation and guarantees that the objective value can be monotonically decreased. Moreover, theoretical analysis and experimental results show that the proposed method achieves better scalability while maintaining similar or better decoding ability compared with state-of-the-art methods on large-scale problems.


Investigating the Effectiveness of Laplacian-Based Kernels in Hub Reduction

AAAI Conferences

A โ€œhubโ€ is an object closely surrounded by, or very similar to, many other objects in the dataset. Recent studies by Radovanoviยดc et al. indicate that in high dimensional spaces, hubs almost always emerge, and objects close to the data centroid tend to become hubs. In this paper, we show that the family of kernels based on the graph Laplacian makes all objects in the dataset equally similar to the centroid, and thus they are expected to make less hubs when used as a similarity measure. We investigate this hypothesis using both synthetic and real-world data. It turns out that these kernels suppress hubs in some cases but not always, and the results seem to be affected by the size of the dataโ€”a factor not discussed previously. However, for the datasets in which hubs are indeed reduced by the Laplacian-based kernels, these kernels work well in ranking and classification tasks. This result suggests that the amount of hubs, which can be readily computed in an unsupervised fashion, can be a yardstick of whether Laplacian-based kernels work effectively for a given data.


Sequence Labeling with Non-Negative Weighted Higher Order Features

AAAI Conferences

In sequence labeling, using higher order features leads to high inference complexity. A lot of studies have been conducted to address this problem. In this paper, we propose a new exact decoding algorithm under the assumption that weights of all higher order features are non-negative. In the worst case, the time complexity of our algorithm is quadratic on the number of higher order features. Comparing with existing algorithms, our method is more efficient and easier to implement. We evaluate our method on two sequence labeling tasks: Optical Character Recognition and Chinese part-of-speech tagging. Our experimental results demonstrate that adding higher order features significantly improves the performance while requiring only 30% additional inference time.


Leveraging Domain Knowledge in Multitask Bayesian Network Structure Learning

AAAI Conferences

Network structure learning algorithms have aided network discovery in fields such as bioinformatics, neuroscience, ecology and social science. However, challenges remain in learning informative networks for related sets of tasks because the search space of Bayesian network structures is characterized by large basins of approximately equivalent solutions. Multitask algorithms select a set of networks that are near each other in the search space, rather than a score-equivalent set of networks chosen from independent regions of the space. This selection preference allows a domain expert to see only differences supported by the data. However, the usefulness of these algorithms for scientific datasets is limited because existing algorithms naively assume that all pairs of tasks are equally related. We introduce a framework that relaxes this assumption by incorporating domain knowledge about task-relatedness into the learning objective. Using our framework, we introduce the first multitask Bayesian network algorithm that leverages domain knowledge about the relatedness of tasks. We use our algorithm to explore the effect of task-relatedness on network discovery and show that our algorithm learns networks that are closer to ground truth than naive algorithms and that our algorithm discovers patterns that are interesting.