Goto

Collaborating Authors

 Genre


Joint and Coupled Bilingual Topic Model Based Sentence Representations for Language Model Adaptation

AAAI Conferences

This paper is concerned with data selection for adapting language model (LM) in statistical machine translation (SMT), and aims to find the LM training sentences that are topic similar to the translation task. Although the traditional approaches have gained significant performance, they ignore the topic information and the distribution information of words when selecting similar training sentences. In this paper, we present two bilingual topic model (BLTM) (joint and coupled BLTM) based sentence representations for cross-lingual data selection. We map the data selection task into cross-lingual semantic representations that are language independent, then rank and select sentences in the target language LM training corpus for a sentence in the translation task by the semantics-based likelihood. The semantic representations are learned from the parallel corpus, with the assumption that the bilingual pair shares the same or similar distribution over semantic topics. Large-scale experimental results demonstrate that our approaches significantly outperform the state-of-the-art approaches on both LM perplexity and translation performance, respectively.


Opinion Target Extraction Using Partially-Supervised Word Alignment Model

AAAI Conferences

Mining opinion targets from online reviews is an important and challenging task in opinion mining. This paper proposes a novel approach to extract opinion targets by using partial-supervised word alignment model (PSWAM). At first, we apply PSWAM in a monolingual scenario to mine opinion relations in sentences and estimate the associations between words. Then, a graph-based algorithm is exploited to estimate the confidence of each candidate, and the candidates with higher confidence will be extracted as the opinion targets. Compared with existing syntax-based methods, PSWAM can effectively avoid parsing errors when dealing with informal sentences in online reviews. Compared with the methods using alignment model, PSWAM can capture opinion relations more precisely through partial supervision from partial alignment links. Moreover, when estimating candidate confidence, we make penalties on higher-degree vertices in our graph-based algorithm in order to decrease the probability of the random walk running into the unrelated regions in the graph. As a result, some errors can be avoided. The experimental results on three data sets with different sizes and languages show that our approach outperforms state-of-the-art methods.


Crowdsourcing-Assisted Query Structure Interpretation

AAAI Conferences

Structured Web search incorporating data from structured sources into search engine results has attracted much attention from both academic and industrial communities. To understand user's intent, query structure interpretation is proposed to analyze the structure of queries in a query log and map query terms to the semantically relevant attributes of data sources in a target domain. Existing methods assume all queries should be classified to the target domain, and thus they are limited when interpreting queries from different domains in real query logs. To address the problem, we introduce a human-machine hybrid method by utilizing crowdsourcing platforms. Our method selects a small number of query terms and asks the crowdsourcing workers to interpret them, and then infers the interpretations based on the crowdsourcing results. To improve the performance, we propose an iterative probabilistic inference method based on a similarity graph of query terms, and select the most useful query terms for crowdsourcing by considering their domain-relevance and gained benefit. We evaluate our method on a real query log, and the experimental results show that our method outperforms the state-of-the-art method.


Persistent Homology: An Introduction and a New Text Representation for Natural Language Processing

AAAI Conferences

Persistent homology is a mathematical tool from topological data analysis. It performs multi-scale analysis on a set of points and identifies clusters, holes, and voids therein. These latter topological structures complement standard feature representations, making persistent homology an attractive feature extractor for artificial intelligence. Research on persistent homology for AI is in its infancy, and is currently hindered by two issues: the lack of an accessible introduction to AI researchers, and the paucity of applications. In response, the first part of this paper presents a tutorial on persistent homology specifically aimed at a broader audience without sacrificing mathematical rigor. The second part contains one of the first applications of persistent homology to natural language processing. Specifically, our Similarity Filtration with Time Skeleton (SIFTS) algorithm identifies holes that can be interpreted as semantic "tie-backs" in a text document, providing a new document structure representation. We illustrate our algorithm on documents ranging from nursery rhymes to novels, and on a corpus with child and adolescent writings.


Lazy Paired Hyper-Parameter Tuning

AAAI Conferences

In virtually all machine learning applications, hyper-parameter tuning is required to maximize predictive accuracy. Such tuning is computationally expensive, and the cost is further exacerbated by the need for multiple evaluations (via cross-validation or bootstrap) at each configuration setting to guarantee statistically significant results. This paper presents a simple, general technique for improving the efficiency of hyper-parameter tuning by minimizing the number of resampled evaluations at each configuration. We exploit the fact that train-test samples can easily be \emph{matched} across candidate hyper-parameter configurations. This permits the use of paired hypothesis tests and power analysis that allow for statistically sound early elimination of suboptimal candidates to minimize the number of evaluations. Results on synthetic and real-world datasets demonstrate that our method improves over competitors for discrete parameter settings, and enhances state-of-the-art techniques for continuous parameter settings.


Bilevel Visual Words Coding for Image Classification

AAAI Conferences

Bag-of-Words approach has played an important role in recent works for image classification. In consideration of efficiency, most methods use k-means clustering to generate the codebook. The obtained codebooks often lose the cluster size and shape information with distortion errors and low discriminative power. Though some efforts have been made to optimize codebook in sparse coding, they usually incur higher computational cost. Moreover, they ignore the correlations between codes in the following coding stage, that leads to low discriminative power of the final representation. In this paper, we propose a bilevel visual words coding approach in consideration of representation ability, discriminative power and efficiency. In the bilevel codebook generation stage, k-means and an efficient spectral clustering are respectively run in each level by taking both class information and the shapes of each visual word cluster into account. To obtain discriminative representation in the coding stage, we design a certain localized coding rule with bilevel codebook to select local bases. To further achieve an efficient coding referring to this rule, an online method is proposed to efficiently learn a projection of local descriptor to the visual words in the codebook. After projection, coding can be efficiently completed by a low dimensional localized soft-assignment. Experimental results show that our proposed bilevel visual words coding approach outperforms the state-of-the-art approaches for image classification.


A Theoretic Framework of K-Means-Based Consensus Clustering

AAAI Conferences

Consensus clustering emerges as a promising solution to find cluster structures from data. As an efficient approach for consensus clustering, the K-means based method has garnered attention in the literature, but the existing research is still preliminary and fragmented. In this paper, we provide a systematic study on the framework of K-means-based Consensus Clustering (KCC). We first formulate the general definition of KCC, and then reveal a necessary and sufficient condition for utility functions that work for KCC, on both complete and incomplete basic partitionings. Experimental results on various real-world data sets demonstrate that KCC is highly efficient and is comparable to the state-of-the-art methods in terms of clustering quality. In addition, KCC shows high robustness to incomplete basic partitionings with substantial missing values.


An Ensemble of Bayesian Networks for Multilabel Classification

AAAI Conferences

We present a novel approach for multilabel classification based on an ensemble of Bayesian networks. The class variables are connected by a tree; each model of the ensemble uses a different class as root of the tree. We assume the features to be conditionally independent given the classes, thus generalizing the naive Bayes assumption to the multiclass case. This assumption allows us to optimally identify the correlations between classes and features; such correlations are moreover shared across all models of the ensemble. Inferences are drawn from the ensemble via logarithmic opinion pooling. To minimize Hamming loss, we compute the marginal probability of the classes by running standard inference on each Bayesian network in the ensemble, and then pooling the inferences. To instead minimize the subset 0/1 loss, we pool the joint distributions of each model and cast the problem as a MAP inference in the corresponding graphical model. Experiments show that the approach is competitive with state-of-the-art methods for multilabel classification.


Transition Constraints: A Study on the Computational Complexity of Qualitative Change

AAAI Conferences

Many formalisms discussed in the literature on qualitative spatial reasoning are designed for expressing static spatial constraints only. However, dynamic situations arise in virtually all applications of these formalisms, which makes it necessary to study variants and extensions involving change. This paper presents a study on the computational complexity of qualitative change. More precisely, we discuss the reasoning task of finding a solution to a temporal sequence of static reasoning problems where this sequence is subject to additional transition constraints. Our focus is primarily on smoothness and continuity constraints: we show how such transitions can be defined as relations and expressed within qualitative constraint formalisms. Our results demonstrate that for point-based constraint formalisms the interesting fragments become NP-completein the presence of continuity constraints, even if the satisfiability problem of its static descriptions is tractable.


Minimizing Writes in Parallel External Memory Search

AAAI Conferences

Recent research on external-memory search has shown that disks can be effectively usedas secondary storage when performing large breadth-first searches.We introduce the Write-Minimizing Breadth-First Search (WMBFS) algorithm which is designed to minimizethe number of writes performed in an external-memory BFS. WMBFS is also designed to store the results ofthe BFS for later use.We present the results of a BFS on a single-agent version of Chinese Checkers and the Rubik's Cube edge cubes, state spaceswith about 1 trillion states each. In evaluating against a comparable approach, WMBFS reduces the I/O for the Chinese Checkers domain by over an order of magnitude.In Rubik's cube, in addition to reducing I/O, the search is also 3.5 times faster.Analysis of the results suggests the machine and state-space properties necessary for WMBFS to perform well.