 Learning Graphical Models


Deep Learning-Based Goal Recognition in Open-Ended Digital Games

AAAI Conferences

While many open-ended digital games feature non-linear storylines and multiple solution paths, it is challenging for game developers to create effective game experiences in these settings due to the freedom given to the player. To address these challenges, goal recognition, a computational player-modeling task, has been investigated to enable digital games to dynamically predict players’ goals. This paper presents a goal recognition framework based on stacked denoising autoencoders, a variant of deep learning. The learned goal recognition models, which are trained from a corpus of player interactions, not only offer improved performance, but also offer the substantial advantage of eliminating the need for labor-intensive feature engineering. An evaluation demonstrates that the deep learning-based goal recognition framework significantly outperforms the previous state-of-the-art goal recognition approach based on Markov logic networks.


A Bayesian Tensor Factorization Model via Variational Inference for Link Prediction

arXiv.org Machine Learning

Probabilistic approaches for tensor factorization aim to extract meaningful structure from incomplete data by postulating low-rank constraints. Recently, variational Bayesian (VB) inference techniques have successfully been applied to large-scale models. This paper presents full Bayesian inference via VB on both single and coupled tensor factorization models. Our method can be run even for very large models and is easily implemented. It exhibits better prediction performance than existing approaches based on maximum likelihood on several real-world datasets for the missing link prediction problem.


Variational Inference in Sparse Gaussian Process Regression and Latent Variable Models - a Gentle Tutorial

arXiv.org Machine Learning

In this tutorial we explain the inference procedures developed for the sparse Gaussian process (GP) regression and Gaussian process latent variable model (GPLVM). Because of page limits, the derivations given in Titsias (2009) and Titsias & Lawrence (2010) are brief, so getting a full picture of them requires collecting results from several different sources and a substantial amount of algebra to fill in the gaps. Our main goal is thus to collect all the results and full derivations into one place to help speed up understanding of this work. In doing so we present a re-parametrisation of the inference that allows it to be carried out in parallel. A secondary goal for this document is, therefore, to accompany our paper and open-source implementation of the parallel inference scheme for the models. We hope that this document will bridge the gap between the equations as implemented in code and those published in the original papers, in order to make it easier to extend existing work. We assume prior knowledge of Gaussian processes and variational inference, but we also include references for further reading where appropriate.


The automatic creation of concept maps from documents written using morphologically rich languages

arXiv.org Artificial Intelligence

A concept map is a graphical tool for representing knowledge. Concept maps have been used in many different areas, including education, knowledge management, business and intelligence. Constructing concept maps manually can be a complex task; an unskilled person may encounter difficulties in determining and positioning concepts relevant to the problem area. An application that recommends concept candidates and their position in a concept map can significantly help the user in that situation. This paper gives an overview of different approaches to automatic and semi-automatic creation of concept maps from textual and non-textual sources. The concept map mining process is defined, and one method suitable for the creation of concept maps from unstructured textual sources in highly inflected languages such as the Croatian language is described in detail. The proposed method uses statistical and data mining techniques enriched with linguistic tools. With minor adjustments, that method can also be used for concept map mining from textual sources in other morphologically rich languages.


Order-invariant prior specification in Bayesian factor analysis

arXiv.org Machine Learning

In (exploratory) factor analysis, the loading matrix is identified only up to orthogonal rotation. For identifiability, one thus often takes the loading matrix to be lower triangular with positive diagonal entries. In Bayesian inference, a standard practice is then to specify a prior under which the loadings are independent, the off-diagonal loadings are normally distributed, and the diagonal loadings follow a truncated normal distribution. This prior specification, however, depends in an important way on how the variables and associated rows of the loading matrix are ordered. We show how a minor modification of the approach allows one to compute with the identifiable lower triangular loading matrix but maintain invariance properties under reordering of the variables.
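
The rotation ambiguity mentioned in the abstract can be made explicit with a short derivation. The notation here is an assumption for illustration and is not taken from the paper: $y$ is the observed vector, $\Lambda$ the loading matrix, $f$ the latent factors, $\Psi$ a diagonal noise covariance, and $Q$ any orthogonal matrix.

```latex
\begin{align*}
  y &= \Lambda f + \varepsilon, \qquad f \sim \mathcal{N}(0, I_k), \quad \varepsilon \sim \mathcal{N}(0, \Psi), \\
  \operatorname{Cov}(y) &= \Lambda \Lambda^{\top} + \Psi, \\
  (\Lambda Q)(\Lambda Q)^{\top} &= \Lambda Q Q^{\top} \Lambda^{\top} = \Lambda \Lambda^{\top}
  \qquad \text{for every orthogonal } Q .
\end{align*}
```

Because $\Lambda$ and $\Lambda Q$ imply the same covariance, the data cannot distinguish them, which is why a constraint such as a lower triangular $\Lambda$ with positive diagonal is imposed. That constraint treats the first rows differently from the rest, so a prior placed on the constrained loadings can depend on how the variables are ordered.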


Simple Regret Optimization in Online Planning for Markov Decision Processes

Journal of Artificial Intelligence Research

We consider online planning in Markov decision processes (MDPs). In online planning, the agent focuses on its current state only, deliberates about the set of possible policies from that state onwards and, when interrupted, uses the outcome of that exploratory deliberation to choose what action to perform next. Formally, the performance of algorithms for online planning is assessed in terms of simple regret, the agent's expected performance loss when the chosen action, rather than an optimal one, is followed. To date, state-of-the-art algorithms for online planning in general MDPs are either best effort, or guarantee only polynomial-rate reduction of simple regret over time. Here we introduce a new Monte-Carlo tree search algorithm, BRUE, that guarantees exponential-rate and smooth reduction of simple regret. At a high level, BRUE is based on a simple yet non-standard state-space sampling scheme, MCTS2e, in which different parts of each sample are dedicated to different exploratory objectives. We further extend BRUE with a variant of "learning by forgetting." The resulting parametrized algorithm, BRUE(alpha), exhibits even more attractive formal guarantees than BRUE. Our empirical evaluation shows that both BRUE and its generalization, BRUE(alpha), are also very effective in practice and compare favorably to the state-of-the-art.
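
One common way to write the simple regret described in the abstract is sketched below. The symbols are assumptions for illustration, not the paper's own notation: $s_0$ is the current state, $a$ is the action the planner recommends when interrupted, $Q^{*}$ is the optimal action-value function (assuming optimal behavior after the first step), and the expectation is over the planner's internal randomness.

```latex
\[
  \Delta(s_0) \;=\; \mathbb{E}\!\left[\, \max_{a'} Q^{*}(s_0, a') \;-\; Q^{*}(s_0, a) \,\right]
\]
```

Under this reading, a polynomial-rate guarantee bounds $\Delta(s_0)$ by a term that shrinks polynomially in the deliberation time, while an exponential-rate guarantee bounds it by a term that shrinks exponentially.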


Beyond Maximum Likelihood: from Theory to Practice

arXiv.org Machine Learning

Maximum likelihood is the most widely used statistical estimation technique. Recent work by Jiao, Venkat, Han, and Weissman [1] introduced a general methodology for the construction of estimators for functionals in parametric models, and demonstrated improvements - both in theory and in practice - over the maximum likelihood estimator (MLE), particularly in high dimensional scenarios involving parameter dimension comparable to or larger than the number of samples. This approach to estimation, building on results from approximation theory, is shown to yield minimax rate-optimal estimators for a wide class of functionals, implementable with modest computational requirements. In a nutshell, a message of this recent work is that, for a wide class of functionals, the performance of these essentially optimal estimators with n samples is comparable to that of the MLE with n ln n samples. In the present paper, we highlight the applicability of the aforementioned methodology to statistical problems beyond functional estimation, and show that it can yield substantial gains. For example, we demonstrate that for learning tree-structured graphical models, our approach achieves a significant reduction of the required data size compared with the classical Chow-Liu algorithm, which is an implementation of the MLE, to achieve the same accuracy. The key step in improving the Chow-Liu algorithm is to replace the empirical mutual information with the estimator for mutual information proposed in [1]. Further, applying the same replacement approach to classical Bayesian network classification, the resulting classifiers uniformly outperform the previous classifiers on 26 widely used datasets.
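
To make the "replacement" step concrete, here is a minimal sketch of the classical Chow-Liu procedure with the plug-in (empirical) mutual information as edge weight. It is an illustration under stated assumptions, not the paper's implementation: the function names and the `mi_estimator` hook are hypothetical, and the improved estimator from [1] is not implemented here. The hook only marks where a different mutual information estimator would be substituted.

```python
from collections import Counter
from itertools import combinations
from math import log


def empirical_mutual_information(xs, ys):
    """Plug-in (maximum likelihood) estimate of I(X; Y) in nats."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    # p(x, y) * log(p(x, y) / (p(x) p(y))), written with raw counts.
    return sum(
        (c / n) * log(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def chow_liu_tree(samples, mi_estimator=empirical_mutual_information):
    """Return the edges of a maximum-weight spanning tree over the variables.

    samples: list of equal-length tuples of discrete values (one per observation).
    mi_estimator: hypothetical hook, a function (xs, ys) -> estimated mutual
        information; swapping it changes only the edge weights.
    """
    d = len(samples[0])
    columns = [[row[j] for row in samples] for j in range(d)]
    # Score every pair of variables, heaviest first (Kruskal's algorithm).
    weighted = sorted(
        ((mi_estimator(columns[i], columns[j]), i, j)
         for i, j in combinations(range(d), 2)),
        reverse=True,
    )
    parent = list(range(d))  # union-find forest used to detect cycles

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    edges = []
    for _, i, j in weighted:
        ri, rj = find(i), find(j)
        if ri != rj:  # adding the edge does not create a cycle
            parent[ri] = rj
            edges.append((i, j))
    return edges


if __name__ == "__main__":
    # Variable 1 copies variable 0; variable 2 is independent of both.
    data = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)] * 5
    print(chow_liu_tree(data))
```

Keeping the estimator as a parameter means the spanning-tree step is untouched when the edge weights change, which matches the abstract's description of the improvement as a replacement of the mutual information estimate only.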


Identification of jump Markov linear models using particle filters

arXiv.org Machine Learning

Jump Markov linear models consist of a finite number of linear state space models and a discrete variable encoding the jumps (or switches) between the different linear models. Identifying jump Markov linear models is a challenging problem that lacks an analytical solution. We derive a new expectation maximization (EM) type algorithm that produces maximum likelihood estimates of the model parameters. Our development hinges upon recent progress in combining particle filters with Markov chain Monte Carlo methods in solving the nonlinear state smoothing problem inherent in the EM formulation. Key to our development is that we exploit a conditionally linear Gaussian substructure in the model, allowing for an efficient algorithm.


Unsupervised learning of regression mixture models with unknown number of components

arXiv.org Machine Learning

Regression mixture models are widely studied in statistics, machine learning and data analysis. Fitting regression mixtures is challenging and is usually performed by maximum likelihood using the expectation-maximization (EM) algorithm. However, it is well known that initialization is crucial for EM. If the initialization is inappropriately performed, the EM algorithm may lead to unsatisfactory results. The EM algorithm also requires the number of clusters to be given a priori; the problem of selecting the number of mixture components requires using model selection criteria to choose one from a set of pre-estimated candidate models. We propose a new fully unsupervised algorithm to learn regression mixture models with an unknown number of components. The developed unsupervised learning approach consists in a penalized maximum likelihood estimation carried out by a robust EM algorithm for fitting polynomial, spline and B-spline regression mixtures. The proposed learning approach is fully unsupervised: 1) it simultaneously infers the model parameters and the optimal number of regression mixture components from the data as the learning proceeds, rather than in a two-stage scheme as in standard model-based clustering, where model selection criteria are applied afterward, and 2) it does not require accurate initialization, unlike the standard EM for regression mixtures. The developed approach is applied to curve clustering problems. Numerical experiments on simulated data show that the proposed robust EM algorithm performs well and provides accurate results in terms of robustness with regard to initialization and retrieving the optimal partition with the actual number of clusters. An application to real data in the framework of functional data clustering confirms the benefit of the proposed approach for practical applications.


On tensor rank of conditional probability tables in Bayesian networks

arXiv.org Artificial Intelligence

A difficult task in modeling with Bayesian networks is the elicitation of their numerical parameters. A large number of parameters is needed to specify a conditional probability table (CPT) that has a large parent set. In this paper we show that most CPTs from real applications of Bayesian networks can actually be very well approximated by tables that require substantially fewer parameters. This observation has practical consequences not only for model elicitation but also for efficient probabilistic reasoning with these networks.
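
The size of the saving can be illustrated with a small counting sketch. It assumes, for illustration only, that the CPT is viewed as a tensor with one mode per variable and approximated by a rank-R canonical polyadic (CP) decomposition, which stores one factor vector per mode for each rank-one term. The abstract does not state which decomposition the paper uses, and the function names below are hypothetical. Counts are raw table entries, ignoring normalization constraints.

```python
def full_cpt_entries(child_states, parent_states):
    """Entries in a full CPT: one per joint configuration of child and parents."""
    total = child_states
    for m in parent_states:
        total *= m
    return total


def cp_entries(child_states, parent_states, rank):
    """Entries stored by a rank-`rank` CP approximation (assumed form).

    Each rank-one term keeps one vector per variable, so the cost grows with
    the sum of the state counts rather than their product.
    """
    return rank * (child_states + sum(parent_states))


if __name__ == "__main__":
    parents = [2] * 10  # ten binary parents
    print(full_cpt_entries(2, parents))    # 2 * 2**10 = 2048
    print(cp_entries(2, parents, rank=2))  # 2 * (2 + 20) = 44
```

The full table grows exponentially in the number of parents, while the low-rank form grows linearly for a fixed rank, so the benefit depends on how small a rank gives an acceptable approximation.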