AITopics

This paper is concerned with data selection for adapting the language model (LM) in statistical machine translation (SMT), and aims to find the LM training sentences that are topically similar to the translation task. Although traditional approaches have achieved significant performance gains, they ignore topic information and the distribution of words when selecting similar training sentences. In this paper, we present two sentence representations for cross-lingual data selection, based on bilingual topic models (BLTM): joint BLTM and coupled BLTM. We map the data selection task into cross-lingual semantic representations that are language independent, then rank and select sentences in the target-language LM training corpus for a sentence in the translation task by the semantics-based likelihood. The semantic representations are learned from the parallel corpus, with the assumption that the bilingual pair shares the same or similar distribution over semantic topics. Large-scale experimental results demonstrate that our approaches significantly outperform the state-of-the-art approaches on both LM perplexity and translation performance.

bilingual topic model, language model adaptation, sentence representation

Twenty-Third International Joint Conference on Artificial Intelligence

Genre: Research Report (0.53)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)

Opinion Target Extraction Using Partially-Supervised Word Alignment Model

Liu, Kang (Chinese Academy of Sciences) | Xu, Heng Li (Chinese Academy of Sciences) | Liu, Yang (Chinese Academy of Sciences) | Zhao, Jun (Chinese Academy of Sciences)

opinion target extraction, partially-supervised word alignment model

Mining opinion targets from online reviews is an important and challenging task in opinion mining. This paper proposes a novel approach to extract opinion targets by using a partially-supervised word alignment model (PSWAM). First, we apply PSWAM in a monolingual scenario to mine opinion relations in sentences and estimate the associations between words. Then, a graph-based algorithm is used to estimate the confidence of each candidate, and the candidates with higher confidence are extracted as the opinion targets. Compared with existing syntax-based methods, PSWAM can effectively avoid parsing errors when dealing with informal sentences in online reviews. Compared with methods using alignment models, PSWAM can capture opinion relations more precisely through partial supervision from partial alignment links. Moreover, when estimating candidate confidence, we penalize higher-degree vertices in our graph-based algorithm in order to decrease the probability of the random walk running into unrelated regions of the graph. As a result, some errors can be avoided. The experimental results on three data sets with different sizes and languages show that our approach outperforms state-of-the-art methods.

Twenty-Third International Joint Conference on Artificial Intelligence

Genre: Research Report > Promising Solution (0.53)

Technology: Information Technology > Artificial Intelligence > Natural Language (0.73)

Crowdsourcing-Assisted Query Structure Interpretation

Han, Jun (Tsinghua University) | Fan, Ju (National University of Singapore) | Zhou, Lizhu (Tsinghua University)

crowdsourcing-assisted query structure interpretation

Structured Web search, which incorporates data from structured sources into search engine results, has attracted much attention from both academic and industrial communities. To understand users' intent, query structure interpretation is proposed to analyze the structure of queries in a query log and map query terms to the semantically relevant attributes of data sources in a target domain. Existing methods assume all queries should be classified to the target domain, and thus they are limited when interpreting queries from different domains in real query logs. To address the problem, we introduce a human-machine hybrid method that uses crowdsourcing platforms. Our method selects a small number of query terms and asks the crowdsourcing workers to interpret them, and then infers the remaining interpretations based on the crowdsourcing results. To improve the performance, we propose an iterative probabilistic inference method based on a similarity graph of query terms, and select the most useful query terms for crowdsourcing by considering their domain-relevance and gained benefit. We evaluate our method on a real query log, and the experimental results show that our method outperforms the state-of-the-art method.

Twenty-Third International Joint Conference on Artificial Intelligence

Genre: Research Report (0.53)

Technology:

Information Technology > Communications > Social Media > Crowdsourcing (1.00)
Information Technology > Information Management (0.87)
Information Technology > Artificial Intelligence (0.87)

Zhu, Xiaojin (University of Wisconsin-Madison)

Persistent Homology: An Introduction and a New Text Representation for Natural Language Processing

Persistent homology is a mathematical tool from topological data analysis. It performs multi-scale analysis on a set of points and identifies clusters, holes, and voids therein. These latter topological structures complement standard feature representations, making persistent homology an attractive feature extractor for artificial intelligence. Research on persistent homology for AI is in its infancy, and is currently hindered by two issues: the lack of an accessible introduction to AI researchers, and the paucity of applications. In response, the first part of this paper presents a tutorial on persistent homology specifically aimed at a broader audience without sacrificing mathematical rigor. The second part contains one of the first applications of persistent homology to natural language processing. Specifically, our Similarity Filtration with Time Skeleton (SIFTS) algorithm identifies holes that can be interpreted as semantic "tie-backs" in a text document, providing a new document structure representation. We illustrate our algorithm on documents ranging from nursery rhymes to novels, and on a corpus with child and adolescent writings.
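The multi-scale view of clusters described above can be illustrated with a minimal sketch. It covers only the 0-dimensional case (connected components), not the holes that SIFTS targets, and it is not the paper's algorithm. The function name `zero_dim_persistence` and the convention that the scale equals the pairwise Euclidean distance are assumptions made for illustration.

```python
import math
from itertools import combinations


def zero_dim_persistence(points):
    """Return the scales at which clusters merge as the distance threshold grows.

    Every point starts as its own cluster at scale 0. Edges are added in
    order of increasing length; whenever an edge joins two different
    clusters, one cluster "dies" at that length. One cluster never dies.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            deaths.append(length)
    return deaths


# Two nearby points and one distant point: a short-lived and a long-lived cluster.
merge_scales = zero_dim_persistence([(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)])
```

Long gaps between consecutive merge scales indicate well-separated clusters. Detecting holes and voids requires higher-dimensional homology, which this sketch does not attempt.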

natural language processing, new text representation, persistent homology

Twenty-Third International Joint Conference on Artificial Intelligence

Genre: Instructional Material > Course Syllabus & Notes (0.53)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)

Zheng, Alice X (Microsoft Research, Redmond) | Bilenko, Mikhail (Microsoft Research, Redmond)

Lazy Paired Hyper-Parameter Tuning

In virtually all machine learning applications, hyper-parameter tuning is required to maximize predictive accuracy. Such tuning is computationally expensive, and the cost is further exacerbated by the need for multiple evaluations (via cross-validation or bootstrap) at each configuration setting to guarantee statistically significant results. This paper presents a simple, general technique for improving the efficiency of hyper-parameter tuning by minimizing the number of resampled evaluations at each configuration. We exploit the fact that train-test samples can easily be matched across candidate hyper-parameter configurations. This permits the use of paired hypothesis tests and power analysis that allow for statistically sound early elimination of suboptimal candidates to minimize the number of evaluations. Results on synthetic and real-world datasets demonstrate that our method improves over competitors for discrete parameter settings, and enhances state-of-the-art techniques for continuous parameter settings.
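The idea of matching train-test samples across configurations can be sketched with a paired t statistic, computed on the per-fold score differences of two configurations evaluated on the same folds. This is only an illustration of the pairing step. The function name `paired_t_statistic` and the example scores are hypothetical, and the power analysis and elimination rule from the paper are not reproduced here.

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """Paired t statistic for two configurations scored on the same folds."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need at least two matched evaluations")
    # Pairing removes the fold-to-fold variation shared by both configurations.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)
    if sd_diff == 0.0:
        # All differences identical: the statistic is degenerate.
        return 0.0 if mean_diff == 0.0 else math.copysign(math.inf, mean_diff)
    return mean_diff / (sd_diff / math.sqrt(len(diffs)))


# Hypothetical accuracies of two configurations on four matched folds.
config_a = [0.80, 0.70, 0.90, 0.60]
config_b = [0.78, 0.69, 0.87, 0.58]
t_value = paired_t_statistic(config_a, config_b)
```

The fold scores vary widely, but the differences are small and consistent, so the paired statistic is large even with few folds. In a tuning loop the statistic would be compared against a critical value from the t distribution with `len(diffs) - 1` degrees of freedom to decide whether a candidate can be dropped early.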

lazy paired hyper-parameter tuning

Twenty-Third International Joint Conference on Artificial Intelligence

Genre: Research Report (0.53)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.53)

Bilevel Visual Words Coding for Image Classification

Zhang, Jiemi (Zhejiang University) | Wu, Chenxia (Zhejiang University) | Cai, Deng (Zhejiang University) | Zhu, Jianke (Zhejiang University)

The Bag-of-Words approach has played an important role in recent work on image classification. For efficiency, most methods use k-means clustering to generate the codebook. The resulting codebooks often lose cluster size and shape information, and suffer from distortion errors and low discriminative power. Though some efforts have been made to optimize the codebook in sparse coding, they usually incur higher computational cost. Moreover, they ignore the correlations between codes in the subsequent coding stage, which leads to low discriminative power of the final representation. In this paper, we propose a bilevel visual words coding approach that takes representation ability, discriminative power, and efficiency into account. In the bilevel codebook generation stage, k-means and an efficient spectral clustering are run at the respective levels, taking both class information and the shape of each visual word cluster into account. To obtain a discriminative representation in the coding stage, we design a localized coding rule that uses the bilevel codebook to select local bases. To make coding under this rule efficient, we propose an online method that learns a projection from local descriptors to the visual words in the codebook. After projection, coding can be completed efficiently by a low-dimensional localized soft-assignment. Experimental results show that our proposed bilevel visual words coding approach outperforms the state-of-the-art approaches for image classification.

bilevel visual word coding, image classification

Twenty-Third International Joint Conference on Artificial Intelligence

Genre: Research Report (0.53)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.80)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.80)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.73)

A Theoretic Framework of K-Means-Based Consensus Clustering

Wu, Junjie (Beihang University) | Liu, Hongfu (Beihang University) | Xiong, Hui (Rutgers University) | Cao, Jie (Nanjing University of Finance and Economics)

Consensus clustering emerges as a promising solution to find cluster structures from data. As an efficient approach for consensus clustering, the K-means based method has garnered attention in the literature, but the existing research is still preliminary and fragmented. In this paper, we provide a systematic study on the framework of K-means-based Consensus Clustering (KCC). We first formulate the general definition of KCC, and then reveal a necessary and sufficient condition for utility functions that work for KCC, on both complete and incomplete basic partitionings. Experimental results on various real-world data sets demonstrate that KCC is highly efficient and is comparable to the state-of-the-art methods in terms of clustering quality. In addition, KCC shows high robustness to incomplete basic partitionings with substantial missing values.

k-means-based consensus clustering, theoretic framework

Twenty-Third International Joint Conference on Artificial Intelligence

Genre: Research Report > Promising Solution (0.53)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.80)

An Ensemble of Bayesian Networks for Multilabel Classification

Antonucci, Alessandro (IDSIA) | Corani, Giorgio (IDSIA) | Mauá, Denis Deratani (IDSIA) | Gabaglio, Sandra (ISIN-SUPSI)

We present a novel approach for multilabel classification based on an ensemble of Bayesian networks. The class variables are connected by a tree; each model of the ensemble uses a different class as root of the tree. We assume the features to be conditionally independent given the classes, thus generalizing the naive Bayes assumption to the multiclass case. This assumption allows us to optimally identify the correlations between classes and features; such correlations are moreover shared across all models of the ensemble. Inferences are drawn from the ensemble via logarithmic opinion pooling. To minimize Hamming loss, we compute the marginal probability of the classes by running standard inference on each Bayesian network in the ensemble, and then pooling the inferences. To instead minimize the subset 0/1 loss, we pool the joint distributions of each model and cast the problem as a MAP inference in the corresponding graphical model. Experiments show that the approach is competitive with state-of-the-art methods for multilabel classification.
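The pooling step mentioned above can be sketched as follows. A logarithmic opinion pool combines distributions by a weighted geometric mean followed by renormalization. This is a minimal illustration on label marginals, not the authors' implementation. The function name `log_opinion_pool`, the dictionary representation, and the default uniform weights are assumptions.

```python
import math


def log_opinion_pool(distributions, weights=None):
    """Pool discrete distributions over the same outcomes.

    The pooled probability of each outcome is proportional to the product of
    the individual probabilities raised to their weights.
    """
    n = len(distributions)
    if weights is None:
        weights = [1.0 / n] * n  # assumption: every model counts equally
    log_scores = {}
    for outcome in distributions[0]:
        total = 0.0
        for dist, weight in zip(distributions, weights):
            p = dist[outcome]
            if p == 0.0:
                # A single zero vetoes the outcome in a log pool.
                total = -math.inf
                break
            total += weight * math.log(p)
        log_scores[outcome] = total
    top = max(log_scores.values())
    if top == -math.inf:
        raise ValueError("every outcome has zero probability under some model")
    # Subtract the maximum before exponentiating for numerical stability.
    unnormalized = {o: math.exp(s - top) for o, s in log_scores.items()}
    z = sum(unnormalized.values())
    return {o: v / z for o, v in unnormalized.items()}


# Hypothetical marginals for one binary label from two ensemble members.
pooled = log_opinion_pool([{1: 0.9, 0: 0.1}, {1: 0.6, 0: 0.4}])
```

For Hamming loss, each label would then be predicted from its own pooled marginal. The subset 0/1 loss case described in the abstract pools joint distributions and needs MAP inference, which this sketch does not cover.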

bayesian network, ensemble, multilabel classification

Twenty-Third International Joint Conference on Artificial Intelligence

Genre: Research Report > Promising Solution (0.53)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.80)

Transition Constraints: A Study on the Computational Complexity of Qualitative Change

Westphal, Matthias (University of Freiburg) | Hué, Julien (University of Freiburg) | Wölfl, Stefan (University of Freiburg) | Nebel, Bernhard (University of Freiburg)

Many formalisms discussed in the literature on qualitative spatial reasoning are designed for expressing static spatial constraints only. However, dynamic situations arise in virtually all applications of these formalisms, which makes it necessary to study variants and extensions involving change. This paper presents a study on the computational complexity of qualitative change. More precisely, we discuss the reasoning task of finding a solution to a temporal sequence of static reasoning problems where this sequence is subject to additional transition constraints. Our focus is primarily on smoothness and continuity constraints: we show how such transitions can be defined as relations and expressed within qualitative constraint formalisms. Our results demonstrate that for point-based constraint formalisms the interesting fragments become NP-complete in the presence of continuity constraints, even if the satisfiability problem of its static descriptions is tractable.

computational complexity, qualitative change, transition constraint

Twenty-Third International Joint Conference on Artificial Intelligence

Genre: Research Report > New Finding (0.53)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.73)

Sturtevant, Nathan R. (University of Denver) | Rutherford, Matthew J. (University of Denver)

Minimizing Writes in Parallel External Memory Search

Recent research on external-memory search has shown that disks can be effectively used as secondary storage when performing large breadth-first searches. We introduce the Write-Minimizing Breadth-First Search (WMBFS) algorithm, which is designed to minimize the number of writes performed in an external-memory BFS. WMBFS is also designed to store the results of the BFS for later use. We present the results of a BFS on a single-agent version of Chinese Checkers and the Rubik's Cube edge cubes, state spaces with about 1 trillion states each. In evaluating against a comparable approach, WMBFS reduces the I/O for the Chinese Checkers domain by over an order of magnitude. In Rubik's Cube, in addition to reducing I/O, the search is also 3.5 times faster. Analysis of the results suggests the machine and state-space properties necessary for WMBFS to perform well.

minimizing write, parallel external memory search

Twenty-Third International Joint Conference on Artificial Intelligence

Genre: Research Report > New Finding (0.53)

Industry: Leisure & Entertainment > Games > Chinese Checkers (0.44)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.73)