A Local Non-Negative Pursuit Method for Intrinsic Manifold Structure Preservation
Chen, Dongdong (Sichuan University) | Lv, Jian Cheng (Sichuan University) | Yi, Zhang (Sichuan University)
Local neighborhood selection plays a crucial role in most representation-based manifold learning algorithms. This paper reveals that an improper selection of the neighborhood used for learning a representation will introduce negative components into the learnt representations. Importantly, representations with negative components will affect the preservation of the intrinsic manifold structure. In this paper, a local non-negative pursuit (LNP) method is proposed for neighborhood selection, and non-negative representations are learnt. Moreover, it is proved that the learnt representations are sparse and convex. Theoretical analysis and experimental results show that the proposed method achieves or outperforms the state-of-the-art results on various manifold learning problems.
Dynamic Bayesian Probabilistic Matrix Factorization
Chatzis, Sotirios (Cyprus University of Technology)
Collaborative filtering algorithms generally rely on the assumption that user preference patterns remain stationary. However, real-world relational data are seldom stationary. User preference patterns may change over time, giving rise to the requirement of designing collaborative filtering systems capable of detecting and adapting to preference pattern shifts. Motivated by this observation, in this paper we propose a dynamic Bayesian probabilistic matrix factorization model, designed for modeling time-varying distributions. Formulation of our model is based on imposition of a dynamic hierarchical Dirichlet process (dHDP) prior over the space of probabilistic matrix factorization models to capture the time-evolving statistical properties of modeled sequential relational datasets. We develop a simple Markov Chain Monte Carlo sampler to perform inference. We present experimental results to demonstrate the superiority of our temporal model.
Distribution-Aware Sampling and Weighted Model Counting for SAT
Chakraborty, Supratik (Indian Institute of Technology, Bombay) | Fremont, Daniel J. (University of California, Berkeley) | Meel, Kuldeep S. (Rice University) | Seshia, Sanjit A. (University of California, Berkeley) | Vardi, Moshe Y. (Rice University)
Given a CNF formula and a weight for each assignment of values to variables, two natural problems are weighted model counting and distribution-aware sampling of satisfying assignments. Both problems have a wide variety of important applications. Due to the inherent complexity of the exact versions of the problems, interest has focused on solving them approximately. Prior work in this area scaled only to small problems in practice, or failed to provide strong theoretical guarantees, or employed computationally expensive most-probable-explanation (MPE) queries that assume prior knowledge of a factored representation of the weight distribution. We identify a novel parameter, tilt, which is the ratio of the maximum weight of a satisfying assignment to the minimum weight of a satisfying assignment, and present a novel approach that works with a black-box oracle for weights of assignments and requires only an NP-oracle (in practice, a SAT solver) to solve both the counting and sampling problems when the tilt is small. Our approach provides strong theoretical guarantees and scales to problems involving several thousand variables. We also show that the assumption of small tilt can be significantly relaxed while improving computational efficiency if a factored representation of the weights is known.
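To make the two quantities in this abstract concrete, the following is a minimal brute-force sketch of the weighted model count and the tilt for a tiny formula. It is not the approximate, oracle-based approach the paper proposes; it enumerates all assignments and so only works for a handful of variables. The function names, the example formula and the weight function are all illustrative assumptions.

```python
from itertools import product


def satisfying_assignments(clauses, num_vars):
    """Yield every assignment (tuple of bools) that satisfies the CNF.

    Clauses use DIMACS-style literals: k means variable k is true,
    -k means variable k is false (variables are numbered from 1).
    """
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            yield bits


def weighted_count_and_tilt(clauses, num_vars, weight):
    """Return (weighted model count, tilt) by exhaustive enumeration."""
    weights = [weight(a) for a in satisfying_assignments(clauses, num_vars)]
    if not weights:
        raise ValueError("formula is unsatisfiable; tilt is undefined")
    # Tilt: maximum weight over minimum weight of the satisfying assignments.
    return sum(weights), max(weights) / min(weights)


# Hypothetical example formula: (x1 OR x2) AND (NOT x1 OR x3)
clauses = [[1, 2], [-1, 3]]


def weight(assignment):
    # Illustrative black-box weight oracle: each true variable doubles the weight.
    return 2.0 ** sum(assignment)


count, tilt = weighted_count_and_tilt(clauses, 3, weight)
print(count, tilt)
```

The example formula has four satisfying assignments with weights 2, 4, 4 and 8, so the weighted count is 18 and the tilt is 4.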
Combining Multiple Correlated Reward and Shaping Signals by Measuring Confidence
Brys, Tim (Vrije Universiteit Brussel) | Nowé, Ann (Vrije Universiteit Brussel) | Kudenko, Daniel (University of York) | Taylor, Matthew E. (Washington State University)
Multi-objective problems with correlated objectives are a class of problems that deserve specific attention. In contrast to typical multi-objective problems, they do not require the identification of trade-offs between the objectives, as (near-) optimal solutions for any objective are (near-) optimal for every objective. Intelligently combining the feedback from these objectives, instead of only looking at a single one, can improve optimization. This class of problems is very relevant in reinforcement learning, as any single-objective reinforcement learning problem can be framed as such a multi-objective problem using multiple reward shaping functions. After discussing this problem class, we propose a solution technique for such reinforcement learning problems, called adaptive objective selection. This technique makes a temporal difference learner estimate the Q-function for each objective in parallel, and introduces a way of measuring confidence in these estimates. This confidence metric is then used to choose which objective's estimates to use for action selection. We show significant improvements in performance over other plausible techniques on two problem domains. Finally, we provide an intuitive analysis of the technique's decisions, yielding insights into the nature of the problems being solved.
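The selection loop described in this abstract can be sketched roughly as follows. The abstract does not say how confidence is measured, so the `confidence` function here (the gap between the best and second-best action value) is a hypothetical stand-in, not the paper's metric. The tabular Q-learning update and all names and parameters are likewise assumptions made for illustration.

```python
import numpy as np


def confidence(q_values):
    """Hypothetical confidence score: gap between the two highest action values."""
    top_two = np.sort(q_values)[-2:]
    return top_two[1] - top_two[0]


def select_action(q_tables, state):
    """Choose the most confident objective in `state`, then act greedily on it.

    q_tables has shape (num_objectives, num_states, num_actions).
    Returns (chosen objective index, chosen action index).
    """
    scores = [confidence(q[state]) for q in q_tables]
    chosen = int(np.argmax(scores))
    return chosen, int(np.argmax(q_tables[chosen, state]))


def td_update(q_tables, state, action, rewards, next_state, alpha=0.1, gamma=0.9):
    """Update every objective's Q-table in parallel, each with its own reward."""
    for q, r in zip(q_tables, rewards):
        target = r + gamma * q[next_state].max()
        q[state, action] += alpha * (target - q[state, action])


# Two objectives, one state, three actions (illustrative values only).
q_tables = np.zeros((2, 1, 3))
q_tables[0, 0] = [0.1, 0.2, 0.15]
q_tables[1, 0] = [0.0, 0.1, 0.9]
objective, action = select_action(q_tables, state=0)
```

In this toy setting the second objective separates its best action from the rest much more clearly, so its estimates drive the action choice.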
Multilabel Classification with Label Correlations and Missing Labels
Bi, Wei (Hong Kong University of Science and Technology) | Kwok, James T. (Hong Kong University of Science and Technology)
Many real-world applications involve multilabel classification, in which the labels can have strong inter-dependencies and some of them may even be missing. Existing multilabel algorithms are unable to handle both issues simultaneously. In this paper, we propose a probabilistic model that can automatically learn and exploit multilabel correlations. By integrating out the missing information, it also provides a disciplined approach to the handling of missing labels. The inference procedure is simple, and the optimization subproblems are convex. Experiments on a number of real-world data sets with both complete and missing labels demonstrate that the proposed algorithm can consistently outperform state-of-the-art multilabel classification algorithms.
Supervised Transfer Sparse Coding
Al-Shedivat, Maruan (King Abdullah University of Science and Technology) | Wang, Jim Jing-Yan (University at Buffalo, The State University of New York) | Alzahrani, Majed (King Abdullah University of Science and Technology) | Huang, Jianhua Z. (Texas A&M University) | Gao, Xin (King Abdullah University of Science and Technology)
A combination of sparse coding and transfer learning techniques has been shown to be accurate and robust in classification tasks where training and testing objects have a shared feature space but are sampled from different underlying distributions, i.e., belong to different domains. The key assumption in such cases is that, in spite of the domain disparity, samples from different domains share some common hidden factors. Previous methods often assumed that all the objects in the target domain are unlabeled, and thus the training set solely comprised objects from the source domain. However, in real-world applications, the target domain often has some labeled objects, or one can always manually label a small number of them. In this paper, we explore this possibility and show how a small number of labeled data in the target domain can significantly improve the classification accuracy of the state-of-the-art transfer sparse coding methods. We further propose a unified framework named supervised transfer sparse coding (STSC) which simultaneously optimizes sparse representation, domain transfer and classification. Experimental results on three applications demonstrate that a little manual labeling followed by learning the model in a supervised fashion can significantly improve classification accuracy.
Mind the Gap: Machine Translation by Minimizing the Semantic Gap in Embedding Space
Zhang, Jiajun (Chinese Academy of Sciences) | Liu, Shujie (Microsoft Research Asia) | Li, Mu (Microsoft Research Asia) | Zhou, Ming (Microsoft Research Asia) | Zong, Chengqing (Chinese Academy of Sciences)
Aiming at retaining the semantic meaning during the translation process, we propose a Recursive Neural Network (RNN) based translation model. Like the previous SMT models, the RNN-based model induces the translation rules from the bitexts. Unlike them, the RNN-based model learns how to represent each lexical translation rule with two compact semantic vectors, and learns how to perform decoding using the merging type (swap or monotone) dependent recursive neural networks that attempt to find the best translation candidate having the minimal semantic gap with the source ...
The conventional statistical machine translation (SMT) models, such as phrase-based models (Koehn et al. 2007), formal syntax-based models (Chiang 2007; Xiong, Liu, and Lin 2006) and linguistically syntax-based models (Liu, Liu, and Lin 2006; Huang, Knight, and Joshi 2006; Galley et al. 2006; Zhang et al. 2008), perform the decoding process and generate the translation result by compositing a set of translation rules which are associated with high probabilities. The probabilities of the translation rules (e.g. the phrasal translation probabilities and the lexical weights in phrase-based and formal syntax-based models) are all computed based on the co-occurrence statistics of the rule's source and target sides in the bilingual corpus.
SUIT: A Supervised User-Item Based Topic Model for Sentiment Analysis
Li, Fangtao (Google Inc.) | Wang, Sheng (University of Illinois Urbana Champaign) | Liu, Shenghua (Chinese Academy of Sciences) | Zhang, Ming (Peking University)
Probabilistic topic models have been widely used for sentiment analysis. However, most existing topic methods model only the sentiment text and do not consider the user who expresses the sentiment or the item on which the sentiment is expressed. Since different users may use different sentiment expressions for different items, we argue that it is better to incorporate the user and item information into the topic model for sentiment analysis. In this paper, we propose a new Supervised User-Item based Topic model, called the SUIT model, for sentiment analysis. It can simultaneously utilize the textual topic and latent user-item factors. Our proposed method uses the tensor outer product of the text topic proportion vector, the user latent factor and the item latent factor to model the sentiment label generalization. Extensive experiments are conducted on two datasets: a review dataset and a microblog dataset. The results demonstrate the advantages of our model, which shows significant improvement over supervised topic models and collaborative filtering methods.
Instance-Based Domain Adaptation in NLP via In-Target-Domain Logistic Approximation
Xia, Rui (Nanjing University of Science and Technology) | Yu, Jianfei (Nanjing University of Science and Technology) | Xu, Feng (Nanjing University of Science and Technology) | Wang, Shumei (Nanjing University of Science and Technology)
In the field of NLP, most existing domain adaptation studies are feature-based, while research on instance-based adaptation is very scarce. In this work, we propose a new instance-based adaptation model, called in-target-domain logistic approximation (ILA). In ILA, we adapt the source-domain data to the target domain by a logistic approximation. The normalized in-target-domain probability is assigned as an instance weight to each source-domain training instance. Finally, an instance-weighted classification model is trained for the cross-domain classification problem. Compared to previous techniques, ILA conducts instance adaptation in a dimensionality-reduced linear feature space to ensure efficiency in high-dimensional NLP tasks. The instance weights in ILA are learnt by leveraging the criteria of both maximum likelihood and minimum statistical distance. The empirical results on two NLP tasks, text categorization and sentiment classification, show that our ILA model significantly outperforms the state-of-the-art instance adaptation methods in cross-domain classification accuracy, parameter stability and computational efficiency.
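The instance-weighting step described in this abstract might look roughly like the sketch below. It assumes the logistic parameters `w` and `b` are already available, whereas the paper learns them using maximum-likelihood and statistical-distance criteria that are not reproduced here. The normalization (rescaling so that the weights average to 1) and all function names are assumptions for illustration.

```python
import numpy as np


def in_target_probability(X, w, b):
    """Logistic estimate of how likely each source instance is to be in-target-domain."""
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))


def instance_weights(probs):
    """Assumed normalization: rescale the probabilities so the weights average to 1."""
    return probs * len(probs) / probs.sum()


def weighted_log_loss(y_true, y_pred, weights):
    """Instance-weighted binary cross-entropy, the loss a weighted classifier would minimize."""
    eps = 1e-12
    y_pred = np.clip(y_pred, eps, 1 - eps)
    losses = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    return float(np.sum(weights * losses) / np.sum(weights))


# Illustrative in-target-domain probabilities for two source instances.
probs = np.array([0.2, 0.6])
weights = instance_weights(probs)
```

Source instances that look more like target-domain data receive larger weights and therefore contribute more to the training loss.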
Fused Feature Representation Discovery for High-Dimensional and Sparse Data
Suzuki, Jun (NTT Communication Science Laboratories) | Nagata, Masaaki (NTT Communication Science Laboratories)
The automatic discovery of a significant low-dimensional feature representation from a given data set is a fundamental problem in machine learning. This paper focuses specifically on the development of the feature representation discovery methods appropriate for high-dimensional and sparse data. We formulate our feature representation discovery problem as a variant of the semi-supervised learning problem, namely, as an optimization problem over unsupervised data whose objective is evaluating the impact of each feature with respect to modeling a target task according to the initial model constructed by using supervised data. The most notable characteristic of our method is that it offers a feasible processing speed even if the numbers of data and features are both in the millions or even billions, and successfully provides a significantly small number of feature sets, i.e., fewer than 10, that can also offer improved performance compared with those obtained with the original feature sets. We demonstrate the effectiveness of our method in experiments consisting of two well-studied natural language processing tasks.