A variety of lifted inference algorithms, which exploit model symmetry to reduce computational cost, have been proposed to render inference tractable in probabilistic relational models. Most existing lifted inference algorithms operate only over discrete domains or continuous domains with restricted potential functions, e.g., Gaussian. We investigate two approximate lifted variational approaches that are applicable to hybrid domains and expressive enough to capture multi-modality. We demonstrate that the proposed variational methods are both scalable and can take advantage of approximate model symmetries, even in the presence of a large amount of continuous evidence. We demonstrate that our approach compares favorably against existing message-passing based approaches in a variety of settings. Finally, we present a sufficient condition for the Bethe approximation to yield a non-trivial estimate over the marginal polytope.
The mean field methods, which entail approximating intractable probability distributions variationally with distributions from a tractable family, enjoy high efficiency, guaranteed convergence, and provide lower bounds on the true likelihood. But due to requirement for model-specific derivation of the optimization equations and unclear inference quality in various models, it is not widely used as a generic approximate inference algorithm. In this paper, we discuss a generalized mean field theory on variational approximation to a broad class of intractable distributions using a rich set of tractable distributions via constrained optimization over distribution spaces. We present a class of generalized mean field (GMF) algorithms for approximate inference in complex exponential family models, which entails limiting the optimization over the class of cluster-factorizable distributions. GMF is a generic method requiring no model-specific derivations. It factors a complex model into a set of disjoint variable clusters, and uses a set of canonical fix-point equations to iteratively update the cluster distributions, and converge to locally optimal cluster marginals that preserve the original dependency structure within each cluster, hence, fully decomposed the overall inference problem. We empirically analyzed the effect of different tractable family (clusters of different granularity) on inference quality, and compared GMF with BP on several canonical models. Possible extension to higher-order MF approximation is also discussed.
By handling whole sets of indistinguishable objects together, lifted belief propagation approaches have rendered large, previously intractable, probabilistic inference problems quickly solvable. In this paper, we show that Kumar and Zilberstein's likelihood maximization (LM) approach to MAP inference is liftable, too, and actually provides additional structure for optimization. Specifically, it has been recognized that some pseudo marginals may converge quickly, turning intuitively into pseudo evidence. This additional evidence typically changes the structure of the lifted network: it may expand or reduce it. The current lifted network, however, can be viewed as an upper bound on the size of the lifted network required to finish likelihood maximization. Consequently, we re-lift the network only if the pseudo evidence yields a reduced network, which can efficiently be computed on the current lifted network. Our experimental results on Ising models, image segmentation and relational entity resolution demonstrate that this bootstrapped LM via "reduce and re-lift" finds MAP assignments comparable to those found by the original LM approach, but in a fraction of the time.
We propose to learn deep undirected graphical models (i.e., MRFs), with a non-ELBO objective for which we can calculate exact gradients. In particular, we optimize a saddle-point objective deriving from the Bethe free energy approximation to the partition function. Unlike much recent work in approximate inference, the derived objective requires no sampling, and can be efficiently computed even for very expressive MRFs. We furthermore amortize this optimization with trained inference networks. Experimentally, we find that the proposed approach compares favorably with loopy belief propagation, but is faster, and it allows for attaining better held out log likelihood than other recent approximate inference schemes.
Inference in general Ising models is difficult, due to high treewidth making tree-based algorithms intractable. Moreover, when interactions are strong, Gibbs sampling may take exponential time to converge to the stationary distribution. We present an algorithm to project Ising model parameters onto a parameter set that is guaranteed to be fast mixing, under several divergences. We find that Gibbs sampling using the projected parameters is more accurate than with the original parameters when interaction strengths are strong and when limited time is available for sampling.