undirected graphical model
Markov locality and relating it to p-locality
To gain intuition for how p-locality functions, we introduce another notion of locality, called Markov locality, which uses the language of Markov blankets. We will prove that, under relatively relaxed conditions, p-locality and Markov locality are equivalent. This will allow us to relate the notion of locality to various graph structures commonly used to represent probability distributions, and will be a key step in proving Properties 2.1 and 2.2. We start by defining the Markov boundary, M(X,S), of a random variable X contained in a set of random variables S, as a minimal set such that p(X|S) = p(X|M(X,S)). The Markov boundary is thus a minimal set of variables such that, once we condition on it, conditioning on any additional random variables in S does not change the probability of X [39]. Similarly, we define a Markov blanket, M(X,S), for X in S as any set of variables such that conditioning on M(X,S) makes X conditionally independent of all other variables [39]. In this way, every Markov boundary is a Markov blanket, but not all blankets are boundaries. Markov locality: Given a probability distribution p(Z) and a function $f: \mathbb{R}^{N_X+N_\theta} \to \mathbb{R}^{N_\theta}$, the update function f(Z) is Markov-local with respect to the distribution p over Z if and only if […] $k$: $Z \mapsto$ […] s.t. […] A Markov boundary can be thought of as the set of variables that 'locally' communicate with the parameter $\theta_k$, thus providing a natural measure of locality. Importantly, for Markov locality to be of use, we would like the Markov boundaries of random variables in the model of interest to be unique.
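The definition of the Markov boundary above can be checked by brute force on a small discrete distribution. The sketch below is illustrative only and is not taken from the paper: the helper names `conditional` and `markov_boundary`, the three-variable chain A - B - C, and its probability tables are all assumptions chosen for the example. It searches for a smallest subset M of the remaining variables such that conditioning on M gives the same distribution for the target as conditioning on everything else, and it assumes a strictly positive joint so that every conditional is well defined.

```python
import itertools
import numpy as np

def conditional(joint, target, given):
    """p(target | given) as an array with joint.ndim axes.

    Axes that are neither the target nor in `given` are summed out and
    kept with size 1 so that results broadcast against each other.
    """
    keep = set(given) | {target}
    drop = tuple(ax for ax in range(joint.ndim) if ax not in keep)
    marg = joint.sum(axis=drop, keepdims=True)    # p(target, given)
    norm = marg.sum(axis=target, keepdims=True)   # p(given)
    return marg / norm

def markov_boundary(joint, target, tol=1e-9):
    """Smallest set M with p(target | all others) == p(target | M).

    Exhaustive search, so only suitable for a handful of variables.
    Returns the first smallest set found; uniqueness is not checked.
    """
    others = [ax for ax in range(joint.ndim) if ax != target]
    full = conditional(joint, target, others)
    for size in range(len(others) + 1):
        for subset in itertools.combinations(others, size):
            cand = conditional(joint, target, subset)
            if np.allclose(np.broadcast_to(cand, full.shape), full, atol=tol):
                return set(subset)
    return set(others)

# Hypothetical binary chain A - B - C: p(a, b, c) = p(a) p(b|a) p(c|b).
p_a = np.array([0.6, 0.4])
p_b_given_a = np.array([[0.9, 0.1], [0.2, 0.8]])  # rows: a, columns: b
p_c_given_b = np.array([[0.7, 0.3], [0.1, 0.9]])  # rows: b, columns: c
joint = (p_a[:, None, None]
         * p_b_given_a[:, :, None]
         * p_c_given_b[None, :, :])

for axis, name in enumerate("ABC"):
    print(name, markov_boundary(joint, axis))
```

In this chain the boundary of each end variable is its single neighbour and the boundary of the middle variable is both neighbours, which matches the intuition that a Markov boundary collects the variables that communicate 'locally' with the target.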
A Defining Markov locality and relating it to p-locality
… Markov locality, which will use the language of Markov blankets. … Markov blanket but not all blankets are boundaries. … A Markov boundary can be thought of as the set of variables that 'locally' communicate with the parameter … Importantly, for Markov-locality to be of use, we would like the Markov boundaries of random variables in the model of interest to be unique. … Assume all quantities are as in A.1, that the conditional independence relationships … This proof relies on Lemma A.1, proved below. … We wish to prove Eq. 2 … Eq. …
Neural Variational Inference and Learning in Undirected Graphical Models
Many problems in machine learning are naturally expressed in the language of undirected graphical models. Here, we propose black-box learning and inference algorithms for undirected models that optimize a variational approximation to the log-likelihood of the model. Central to our approach is an upper bound on the log-partition function parametrized by a function q that we express as a flexible neural network. Our bound makes it possible to track the partition function during learning, to speed up sampling, and to train a broad class of hybrid directed/undirected models via a unified variational inference framework. We empirically demonstrate the effectiveness of our method on several popular generative modeling datasets.
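The abstract does not state the bound itself. As a hedged illustration, one standard way to upper-bound a partition function with a proposal distribution q is sketched below. The symbols $\tilde p_\theta$ (the unnormalized model) and $Z(\theta)$ are notation introduced here for the example, q is assumed to be positive wherever $\tilde p_\theta$ is, and the bound used in the paper may differ in form.

```latex
% Write the partition function as an expectation under q (importance weights):
Z(\theta) \;=\; \sum_x \tilde p_\theta(x)
          \;=\; \mathbb{E}_{x \sim q}\!\left[ \frac{\tilde p_\theta(x)}{q(x)} \right].

% Jensen's inequality, (E[W])^2 <= E[W^2], applied to W = \tilde p_\theta / q:
Z(\theta)^2 \;\le\; \mathbb{E}_{x \sim q}\!\left[ \frac{\tilde p_\theta(x)^2}{q(x)^2} \right],
\qquad\text{hence}\qquad
\log Z(\theta) \;\le\; \tfrac{1}{2}\,
\log \mathbb{E}_{x \sim q}\!\left[ \frac{\tilde p_\theta(x)^2}{q(x)^2} \right].

% Equality holds when the weight is constant, i.e. q(x) = \tilde p_\theta(x) / Z(\theta).
```

Because the bound is tight exactly when q matches the normalized model, minimizing it over q pushes q toward the model distribution, which is consistent with the abstract's remark that q can be used to track the partition function and to speed up sampling.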
Export Reviews, Discussions, Author Feedback and Meta-Reviews
First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. This paper proposes a novel approach to human pose estimation, consisting of a deep convolutional network for part detection and a higher-level spatial model that is motivated as a graphical model, but actually incorporated into the overall deep network as a particular sub-net that has the plausible interpretation of performing a single round of message passing. The system is trained in three steps. In the first two steps, the deep convolutional part detector and the spatial model are trained individually (the spatial message passing network uses the heat map output of the part detector), while in the third step, the unified network is jointly trained via back propagation. Even though the convolutional part detector alone is already a state-of-the-art system, the spatial model is shown to improve results considerably, with even further improvements gained via the joint training procedure.
Learning Large-Scale Poisson DAG Models based on OverDispersion Scoring
Gunwoong Park, Garvesh Raskutti
In this paper, we address the question of identifiability and learning algorithms for large-scale Poisson Directed Acyclic Graphical (DAG) models. We define general Poisson DAG models as models where each node is a Poisson random variable with rate parameter depending on the values of the parents in the underlying DAG. First, we prove that Poisson DAG models are identifiable from observational data, and present a polynomial-time algorithm that learns the Poisson DAG model under suitable regularity conditions. The main idea behind our algorithm is based on overdispersion, in that variables that are conditionally Poisson are overdispersed relative to variables that are marginally Poisson.
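The overdispersion idea can be illustrated with a small simulation. This is a minimal sketch under stated assumptions and not the authors' algorithm: the two-node model, the rate function `1 + x1`, and the helper `overdispersion_score` are hypothetical choices made for the example. By the law of total variance, a child whose rate depends on a random parent has variance exceeding its mean, whereas a root node that is marginally Poisson has variance equal to its mean.

```python
import numpy as np

def overdispersion_score(samples):
    """Sample variance minus sample mean; close to 0 for Poisson data."""
    samples = np.asarray(samples, dtype=float)
    return samples.var(ddof=1) - samples.mean()

rng = np.random.default_rng(0)
n = 200_000
lam = 2.0

# Root node: marginally Poisson, so Var(X1) = E[X1] = lam.
x1 = rng.poisson(lam, size=n)

# Child node: Poisson given its parent, with the hypothetical rate 1 + X1.
# Var(X2) = E[Var(X2|X1)] + Var(E[X2|X1]) = E[X2] + Var(X1) > E[X2].
x2 = rng.poisson(1.0 + x1)

score_x1 = overdispersion_score(x1)
score_x2 = overdispersion_score(x2)
print(f"score(X1) = {score_x1:.3f}  (expected near 0)")
print(f"score(X2) = {score_x2:.3f}  (expected near Var(X1) = {lam})")
```

Ranking nodes by such a score suggests which variable comes first in a causal ordering: here the node with the score closest to zero is the root.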
Bayesian Estimation of Latently-grouped Parameters in Undirected Graphical Models
In large-scale applications of undirected graphical models, such as social networks and biological networks, similar patterns occur frequently and give rise to similar parameters. In this situation, it is beneficial to group the parameters for more efficient learning. We show that even when the grouping is unknown, we can infer these parameter groups during learning via a Bayesian approach. We impose a Dirichlet process prior on the parameters. Posterior inference usually involves calculating intractable terms, and we propose two approximation algorithms, namely a Metropolis-Hastings algorithm with auxiliary variables and a Gibbs sampling algorithm with stripped Beta approximation (Gibbs_SBA). Gibbs_SBA's performance is close to Gibbs sampling with exact likelihood calculation. Models learned with Gibbs_SBA also generalize better than the models learned by MLE on real-world Senate voting data.