AITopics | Undirected Networks

Linear State-Space Model with Time-Varying Dynamics

Luttinen, Jaakko, Raiko, Tapani, Ilin, Alexander

arXiv.org Machine Learning, Oct-3-2014

This paper introduces a linear state-space model with time-varying dynamics. The time dependency is obtained by forming the state dynamics matrix as a time-varying linear combination of a set of matrices. The time dependency of the weights in the linear combination is modelled by another linear Gaussian dynamical model, allowing the model to learn how the dynamics of the process change. Previous approaches have used switching models, which have a small set of possible state dynamics matrices and select one of those matrices at each time step, thus jumping between them. Our model forms the dynamics as a linear combination, so the changes can be smooth and continuous. The model is motivated by physical processes which are described by linear partial differential equations whose parameters vary in time. An example of such a process could be a temperature field whose evolution is driven by a varying wind direction. The posterior inference is performed using variational Bayesian approximation. The experiments on stochastic advection-diffusion processes and real-world weather processes show that the model with time-varying dynamics can outperform previously introduced approaches.
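The generative structure described in this abstract can be sketched as a small simulation. This is only an illustration under assumptions, not the authors' code: the names `simulate`, `basis`, and `weight_dynamics`, the initial values, and the shared noise level are all hypothetical, and the observation model and the variational Bayesian inference are left out.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def simulate(basis, weight_dynamics, n_steps, noise_std=0.1, seed=0):
    """Simulate latent states whose dynamics matrix changes over time.

    basis: list of square matrices that are mixed at every step.
    weight_dynamics: matrix driving the mixing weights (assumed name).
    """
    rng = random.Random(seed)
    n_basis = len(basis)
    dim = len(basis[0])
    weights = [1.0 / n_basis] * n_basis  # assumed initial mixing weights
    state = [1.0] * dim                  # assumed initial state
    states = []
    for _ in range(n_steps):
        # The mixing weights follow their own linear Gaussian dynamics.
        weights = [w + rng.gauss(0.0, noise_std)
                   for w in matvec(weight_dynamics, weights)]
        # Time-varying dynamics matrix: weighted sum of the basis matrices.
        dynamics = [[sum(weights[k] * basis[k][i][j] for k in range(n_basis))
                     for j in range(dim)] for i in range(dim)]
        # The state evolves linearly under the current dynamics matrix.
        state = [x + rng.gauss(0.0, noise_std)
                 for x in matvec(dynamics, state)]
        states.append(state)
    return states
```

Because the weights drift continuously, the effective dynamics matrix changes smoothly from step to step, in contrast to a switching model that would pick exactly one basis matrix at each step.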

artificial intelligence, machine learning, matrix, (17 more...)

doi: 10.1007/978-3-662-44851-9_22

1410.0555

Country: Europe (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.69)

Mapping Energy Landscapes of Non-Convex Learning Problems

Pavlovskaia, Maria, Tu, Kewei, Zhu, Song-Chun

arXiv.org Machine Learning, Oct-2-2014

In many statistical learning problems, the target functions to be optimized are highly non-convex in various model spaces and thus are difficult to analyze. In this paper, we compute Energy Landscape Maps (ELMs) which characterize and visualize an energy function with a tree structure, in which each leaf node represents a local minimum and each non-leaf node represents the barrier between adjacent energy basins. The ELM also associates each node with the estimated probability mass and volume for the corresponding energy basin. We construct ELMs by adopting the generalized Wang-Landau algorithm and multidomain sampler that simulates a Markov chain traversing the model space by dynamically reweighting the energy function. We construct ELMs in the model space for two classic statistical learning problems: i) clustering with Gaussian mixture models or Bernoulli templates; and ii) bi-clustering. We propose a way to measure the difficulties (or complexity) of these learning problems and study how various conditions affect the landscape complexity, such as separability of the clusters, the number of examples, and the level of supervision; and we also visualize the behaviors of different algorithms, such as K-means, EM, two-step EM and Swendsen-Wang cuts, in the energy landscapes.

Key words and phrases: Non-convex Optimization, Visualization, Clustering, Bi-clustering, Markov chain Monte Carlo.

artificial intelligence, bayesian inference, machine learning, (18 more...)

1410.0576

Country: North America > United States > California > Los Angeles County > Los Angeles (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Education > Focused Education > Special Education (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Deep Tempering

Desjardins, Guillaume, Luo, Heng, Courville, Aaron, Bengio, Yoshua

arXiv.org Machine Learning, Oct-1-2014

Restricted Boltzmann Machines (RBMs) are one of the fundamental building blocks of deep learning. Approximate maximum likelihood training of RBMs typically necessitates sampling from these models. In many training scenarios, computationally efficient Gibbs sampling procedures are crippled by poor mixing. In this work we propose a novel method of sampling from Boltzmann machines that demonstrates a computationally efficient way to promote mixing. Our approach leverages an under-appreciated property of deep generative models such as the Deep Belief Network (DBN), where Gibbs sampling from deeper levels of the latent variable hierarchy results in dramatically increased ergodicity. Our approach is thus to train an auxiliary latent hierarchical model, based on the DBN. When used in conjunction with parallel-tempering, the method is asymptotically guaranteed to simulate samples from the target RBM. Experimental results confirm the effectiveness of this sampling strategy in the context of RBM training.

artificial intelligence, machine learning, rbm, (17 more...)

1410.0123

Country: Europe (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.72)

A Hierarchical Approach to Generating Maps Using Markov Chains

Snodgrass, Sam (Drexel University) | Ontanon, Santiago (Drexel University)

AAAI Conferences, Sep-29-2014

In this paper we describe a hierarchical method for procedurally generating maps using Markov chains. Our method takes as input a collection of human-authored two-dimensional maps, and splits them into high-level tiles which capture large structures. Markov chains are then learned from those maps to capture the structure of both the high-level and the low-level tiles. Then, the learned Markov chains are used to generate new maps by first generating the high-level structure of the map using high-level tiles, and then generating the low-level layout of the map. We validate our approach using the game Super Mario Bros., by evaluating the quality of maps produced using different configurations for training and generation.
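The core step of learning a Markov chain from example maps and sampling from it can be sketched as follows. This is a deliberately simplified, single-level illustration and not the paper's method: it assumes each tile depends only on its left neighbor within a row, and the names `learn_chain` and `sample_row` and the one-character tile encoding are hypothetical. The approach in the abstract applies the same idea at two levels, first to high-level tiles and then to low-level tiles.

```python
import random
from collections import defaultdict


def learn_chain(maps):
    """Count left-neighbor -> tile transitions over all rows of all maps.

    maps: list of maps; each map is a list of rows; each row is a string
    whose characters are tile symbols (an assumed encoding).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for grid in maps:
        for row in grid:
            for prev, cur in zip(row, row[1:]):
                counts[prev][cur] += 1
    return counts


def sample_row(counts, start, length, rng):
    """Sample one row of tiles, starting from the given tile symbol."""
    row = [start]
    while len(row) < length:
        options = counts.get(row[-1])
        if not options:
            break  # tile never seen with a successor: stop early
        tiles = list(options)
        weights = [options[t] for t in tiles]
        # Draw the next tile in proportion to its observed frequency.
        row.append(rng.choices(tiles, weights=weights)[0])
    return "".join(row)
```

For example, `sample_row(learn_chain(training_maps), "-", 20, random.Random(0))` would produce a 20-tile row, where `training_maps` stands for whatever human-authored maps are available.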

artificial intelligence, hierarchical approach, machine learning, (2 more...)

Tenth Artificial Intelligence and Interactive Digital Entertainment Conference

Industry: Leisure & Entertainment > Games > Computer Games (0.53)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Deep Learning-Based Goal Recognition in Open-Ended Digital Games

Min, Wookhee (North Carolina State University) | Ha, Eun Young (North Carolina State University) | Rowe, Jonathan (North Carolina State University) | Mott, Bradford (North Carolina State University) | Lester, James (North Carolina State University)

AAAI Conferences, Sep-29-2014

While many open-ended digital games feature non-linear storylines and multiple solution paths, it is challenging for game developers to create effective game experiences in these settings due to the freedom given to the player. To address these challenges, goal recognition, a computational player-modeling task, has been investigated to enable digital games to dynamically predict players’ goals. This paper presents a goal recognition framework based on stacked denoising autoencoders, a variant of deep learning. The learned goal recognition models, which are trained from a corpus of player interactions, not only offer improved performance, but also offer the substantial advantage of eliminating the need for labor-intensive feature engineering. An evaluation demonstrates that the deep learning-based goal recognition framework significantly outperforms the previous state-of-the-art goal recognition approach based on Markov logic networks.

artificial intelligence, belief revision, machine learning, (2 more...)

Tenth Artificial Intelligence and Interactive Digital Entertainment Conference

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Belief Revision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.53)

Simple Regret Optimization in Online Planning for Markov Decision Processes

Feldman, Z., Domshlak, C.

Journal of Artificial Intelligence Research, Sep-25-2014

We consider online planning in Markov decision processes (MDPs). In online planning, the agent focuses on its current state only, deliberates about the set of possible policies from that state onwards and, when interrupted, uses the outcome of that exploratory deliberation to choose what action to perform next. Formally, the performance of algorithms for online planning is assessed in terms of simple regret, the agent's expected performance loss when the chosen action, rather than an optimal one, is followed. To date, state-of-the-art algorithms for online planning in general MDPs are either best effort, or guarantee only polynomial-rate reduction of simple regret over time. Here we introduce a new Monte-Carlo tree search algorithm, BRUE, that guarantees exponential-rate and smooth reduction of simple regret. At a high level, BRUE is based on a simple yet non-standard state-space sampling scheme, MCTS2e, in which different parts of each sample are dedicated to different exploratory objectives. We further extend BRUE with a variant of "learning by forgetting." The resulting parametrized algorithm, BRUE(alpha), exhibits even more attractive formal guarantees than BRUE. Our empirical evaluation shows that both BRUE and its generalization, BRUE(alpha), are also very effective in practice and compare favorably to the state-of-the-art.
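The simple regret measure mentioned in this abstract can be made concrete with a few lines of code. This is only a sketch of the definition for a single state, assuming the true action values are known (which they are not during planning); the function name and the dictionary representation are hypothetical, and the BRUE algorithm itself is not shown.

```python
def simple_regret(action_values, chosen_action):
    """Value lost by following the chosen action instead of an optimal one.

    action_values: dict mapping each action to its true expected value
    at the current state (assumed known here, for illustration only).
    """
    best_value = max(action_values.values())
    return best_value - action_values[chosen_action]
```

A planner that recommends an optimal action has simple regret 0; the guarantees discussed above concern how quickly the expected value of this quantity shrinks as deliberation time grows.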

algorithm, brue, simple regret optimization, (13 more...)

doi: 10.1613/jair.4432

AI Access Foundation

10905

Country:

North America > United States > New York (0.04)
North America > United States > California (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(3 more...)

Industry: Leisure & Entertainment > Games (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.70)

Tight Error Bounds for Structured Prediction

Globerson, Amir, Roughgarden, Tim, Sontag, David, Yildirim, Cafer

arXiv.org Machine Learning, Sep-19-2014

Structured prediction tasks in machine learning involve the simultaneous prediction of multiple labels. This is typically done by maximizing a score function on the space of labels, which decomposes as a sum of pairwise elements, each depending on two specific labels. Intuitively, the more pairwise terms are used, the better the expected accuracy. However, there is currently no theoretical account of this intuition. This paper takes a significant step in this direction. We formulate the problem as classifying the vertices of a known graph $G=(V,E)$, where the vertices and edges of the graph are labelled and correlate semi-randomly with the ground truth. We show that the prospects for achieving low expected Hamming error depend on the structure of the graph $G$ in interesting ways. For example, if $G$ is a very poor expander, like a path, then large expected Hamming error is inevitable. Our main positive result shows that, for a wide class of graphs including 2D grid graphs common in machine vision applications, there is a polynomial-time algorithm with small and information-theoretically near-optimal expected error. Our results provide a first step toward a theoretical justification for the empirical success of the efficient approximate inference algorithms that are used for structured prediction in models where exact inference is intractable.

artificial intelligence, inductive learning, machine learning, (18 more...)

1409.5834

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.91)
(2 more...)

Particle Metropolis-Hastings using gradient and Hessian information

Dahlin, Johan, Lindsten, Fredrik, Schön, Thomas B.

arXiv.org Machine Learning, Sep-18-2014

Particle Metropolis-Hastings (PMH) allows for Bayesian parameter inference in nonlinear state space models by combining Markov chain Monte Carlo (MCMC) and particle filtering. The latter is used to estimate the intractable likelihood. In its original formulation, PMH makes use of a marginal MCMC proposal for the parameters, typically a Gaussian random walk. However, this can lead to a poor exploration of the parameter space and an inefficient use of the generated particles. We propose a number of alternative versions of PMH that incorporate gradient and Hessian information about the posterior into the proposal. This information is more or less obtained as a byproduct of the likelihood estimation. Indeed, we show how to estimate the required information using a fixed-lag particle smoother, with a computational cost growing linearly in the number of particles. We conclude that the proposed methods can: (i) decrease the length of the burn-in phase, (ii) increase the mixing of the Markov chain at the stationary phase, and (iii) make the proposal distribution scale invariant which simplifies tuning.
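The baseline that this abstract starts from, PMH with a Gaussian random walk proposal, can be sketched in a few dozen lines. This is an illustration under assumptions and not the authors' method: it omits the gradient and Hessian information the paper adds, and the scalar model, the uniform prior on (-1, 1), the function names, and all default settings are hypothetical choices made to keep the example small.

```python
import math
import random


def log_likelihood_estimate(phi, ys, n_particles, rng, q_std=1.0, r_std=1.0):
    """Bootstrap particle filter estimate of the log-likelihood for the
    assumed model x_t = phi * x_{t-1} + v_t, y_t = x_t + e_t."""
    particles = [rng.gauss(0.0, q_std) for _ in range(n_particles)]
    log_norm = math.log(r_std * math.sqrt(2.0 * math.pi))
    log_lik = 0.0
    for y in ys:
        # Propagate every particle through the state dynamics.
        particles = [phi * x + rng.gauss(0.0, q_std) for x in particles]
        # Weight particles by the Gaussian observation density.
        log_w = [-0.5 * ((y - x) / r_std) ** 2 - log_norm for x in particles]
        max_lw = max(log_w)
        weights = [math.exp(lw - max_lw) for lw in log_w]
        log_lik += max_lw + math.log(sum(weights) / n_particles)
        # Multinomial resampling.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return log_lik


def particle_metropolis_hastings(ys, n_iters, n_particles=100, step=0.1,
                                 seed=0):
    """Random-walk PMH for phi under an assumed uniform prior on (-1, 1)."""
    rng = random.Random(seed)
    phi = 0.0
    log_lik = log_likelihood_estimate(phi, ys, n_particles, rng)
    chain, accepted = [], 0
    for _ in range(n_iters):
        proposal = phi + rng.gauss(0.0, step)  # Gaussian random walk
        if abs(proposal) < 1.0:  # proposals outside the prior are rejected
            prop_ll = log_likelihood_estimate(proposal, ys, n_particles, rng)
            # Symmetric proposal, flat prior: the ratio is the likelihood ratio.
            if math.log(1.0 - rng.random()) < prop_ll - log_lik:
                phi, log_lik = proposal, prop_ll
                accepted += 1
        chain.append(phi)
    return chain, accepted / n_iters
```

The random walk ignores the shape of the posterior, which is the weakness the abstract points to; the proposed variants replace that proposal step with one informed by gradient and Hessian estimates.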

algorithm, artificial intelligence, machine learning, (18 more...)

doi: 10.1007/s11222-014-9510-0

1311.0686

Country:

Europe > Sweden (0.28)
North America > United States (0.28)
Europe > United Kingdom > England (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Model-based Kernel Sum Rule

Nishiyama, Yu, Kanagawa, Motonobu, Gretton, Arthur, Fukumizu, Kenji

arXiv.org Machine Learning, Sep-17-2014

In this study, we enrich the framework of nonparametric kernel Bayesian inference via the flexible incorporation of certain probabilistic models, such as additive Gaussian noise models. Nonparametric inference expressed in terms of kernel means, which is called kernel Bayesian inference, has been studied using basic rules such as the kernel sum rule (KSR), kernel chain rule, kernel product rule, and kernel Bayes' rule (KBR). However, the current framework used for kernel Bayesian inference deals only with nonparametric inference and it cannot allow inference when combined with probabilistic models. In this study, we introduce a novel KSR, called model-based KSR (Mb-KSR), which exploits the knowledge obtained from some probabilistic models of conditional distributions. The incorporation of Mb-KSR into nonparametric kernel Bayesian inference facilitates more flexible kernel Bayesian inference than nonparametric inference. We focus on combinations of Mb-KSR, Non-KSR, and KBR, and we propose a filtering algorithm for state space models, which combines nonparametric learning of the observation process using kernel means and additive Gaussian noise models of the transition dynamics. The idea of the Mb-KSR for additive Gaussian noise models can be extended to more general noise model cases, including a conjugate pair with a positive-definite kernel and a probabilistic model.

artificial intelligence, machine learning, probabilistic model, (16 more...)

1409.5178

Country: Asia > Japan (0.14)

Genre: Research Report > New Finding (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Hardness of parameter estimation in graphical models

Bresler, Guy, Gamarnik, David, Shah, Devavrat

arXiv.org Artificial Intelligence, Sep-17-2014

We consider the problem of learning the canonical parameters specifying an undirected graphical model (Markov random field) from the mean parameters. For graphical models representing a minimal exponential family, the canonical parameters are uniquely determined by the mean parameters, so the problem is feasible in principle. The goal of this paper is to investigate the computational feasibility of this statistical task. Our main result shows that parameter estimation is in general intractable: no algorithm can learn the canonical parameters of a generic pair-wise binary graphical model from the mean parameters in time bounded by a polynomial in the number of variables (unless RP = NP). Indeed, such a result has been believed to be true (see the monograph by Wainwright and Jordan (2008)) but no proof was known. Our proof gives a polynomial time reduction from approximating the partition function of the hard-core model, known to be hard, to learning approximate parameters. Our reduction entails showing that the marginal polytope boundary has an inherent repulsive property, which validates an optimization procedure over the polytope that does not use any knowledge of its structure (as required by the ellipsoid method and others).
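To make the two parameterizations in this abstract concrete, the sketch below computes mean parameters from canonical parameters for a small pairwise binary model by brute-force enumeration. It is an illustration only: the function name and data layout are hypothetical, the cost is exponential in the number of variables, and it shows the forward map (canonical to mean), whereas the hardness result concerns the reverse direction.

```python
import itertools
import math


def mean_parameters(theta_node, theta_edge):
    """Mean parameters of p(x) proportional to
    exp(sum_i theta_i x_i + sum_(i,j) theta_ij x_i x_j), x in {0,1}^n.

    theta_node: list of node parameters.
    theta_edge: dict mapping an edge (i, j) to its parameter.
    Returns (node_means, edge_means), i.e. E[x_i] and E[x_i x_j].
    """
    n = len(theta_node)
    partition = 0.0
    node_sums = [0.0] * n
    edge_sums = {edge: 0.0 for edge in theta_edge}
    # Enumerate all 2^n configurations (feasible only for tiny models).
    for x in itertools.product((0, 1), repeat=n):
        score = sum(t * xi for t, xi in zip(theta_node, x))
        score += sum(t * x[i] * x[j] for (i, j), t in theta_edge.items())
        weight = math.exp(score)
        partition += weight
        for i in range(n):
            node_sums[i] += weight * x[i]
        for (i, j) in theta_edge:
            edge_sums[(i, j)] += weight * x[i] * x[j]
    node_means = [s / partition for s in node_sums]
    edge_means = {edge: s / partition for edge, s in edge_sums.items()}
    return node_means, edge_means
```

With all canonical parameters set to zero the distribution is uniform, so every node mean is 0.5 and every edge mean is 0.25.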

artificial intelligence, hardcore model, machine learning, (17 more...)

1409.3836

Country: Asia > Middle East > Jordan (0.24)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)
