Collaborating Authors

Negative Tree Reweighted Belief Propagation Machine Learning

We introduce a new class of lower bounds on the log partition function of a Markov random field which makes use of a reversed Jensen's inequality. In particular, our method approximates the intractable distribution using a linear combination of spanning trees with negative weights. This technique is a lower-bound counterpart to the tree-reweighted belief propagation algorithm, which uses a convex combination of spanning trees with positive weights to provide corresponding upper bounds. We develop algorithms to optimize and tighten the lower bounds over the non-convex set of valid parameter values. Our algorithm generalizes mean field approaches (including naive and structured mean field approximations), which it includes as a limiting case.

Learning Correlated Latent Representations with Adaptive Priors Machine Learning

Variational Auto-Encoders (VAEs) have been widely applied for learning compact low-dimensional latent representations for high-dimensional data. When the correlation structure among data points is available, previous work proposed Correlated Variational Auto-Encoders (CVAEs) which employ a structured mixture model as prior and a structured variational posterior for each mixture component to enforce the learned latent representations to follow the same correlation structure. However, as we demonstrate in this paper, such a choice can not guarantee that CVAEs can capture all of the correlations. Furthermore, it prevents us from obtaining a tractable joint and marginal variational distribution. To address these issues, we propose Adaptive Correlated Variational Auto-Encoders (ACVAEs), which apply an adaptive prior distribution that can be adjusted during training, and learn a tractable joint distribution via a saddle-point optimization procedure. Its tractable form also enables further refinement with belief propagation. Experimental results on two real datasets show that ACVAEs outperform other benchmarks significantly.

A Generalized Mean Field Algorithm for Variational Inference in Exponential Families Machine Learning

The mean field methods, which entail approximating intractable probability distributions variationally with distributions from a tractable family, enjoy high efficiency, guaranteed convergence, and provide lower bounds on the true likelihood. But due to requirement for model-specific derivation of the optimization equations and unclear inference quality in various models, it is not widely used as a generic approximate inference algorithm. In this paper, we discuss a generalized mean field theory on variational approximation to a broad class of intractable distributions using a rich set of tractable distributions via constrained optimization over distribution spaces. We present a class of generalized mean field (GMF) algorithms for approximate inference in complex exponential family models, which entails limiting the optimization over the class of cluster-factorizable distributions. GMF is a generic method requiring no model-specific derivations. It factors a complex model into a set of disjoint variable clusters, and uses a set of canonical fix-point equations to iteratively update the cluster distributions, and converge to locally optimal cluster marginals that preserve the original dependency structure within each cluster, hence, fully decomposed the overall inference problem. We empirically analyzed the effect of different tractable family (clusters of different granularity) on inference quality, and compared GMF with BP on several canonical models. Possible extension to higher-order MF approximation is also discussed.

Bayesian Pedigree Analysis using Measure Factorization

Neural Information Processing Systems

Pedigrees, or family trees, are directed graphs used to identify sites of the genome that are correlated with the presence or absence of a disease. With the advent of genotyping and sequencing technologies, there has been an explosion in the amount of data available, both in the number of individuals and in the number of sites. Some pedigrees number in the thousands of individuals. Meanwhile, analysis methods have remained limited to pedigrees of <100 individuals which limits analyses to many small independent pedigrees. Disease models, such those used for the linkage analysis log-odds (LOD) estimator, have similarly been limited. This is because linkage anlysis was originally designed with a different task in mind, that of ordering the sites in the genome, before there were technologies that could reveal the order. LODs are difficult to interpret and nontrivial to extend to consider interactions among sites. These developments and difficulties call for the creation of modern methods of pedigree analysis. Drawing from recent advances in graphical model inference and transducer theory, we introduce a simple yet powerful formalism for expressing genetic disease models. We show that these disease models can be turned into accurate and efficient estimators. The technique we use for constructing the variational approximation has potential applications to inference in other large-scale graphical models. This method allows inference on larger pedigrees than previously analyzed in the literature, which improves disease site prediction.

Correlated Variational Auto-Encoders Machine Learning

Variational Auto-Encoders (VAEs) are capable of learning latent representations for high dimensional data. However, due to the i.i.d. assumption, VAEs only optimize the singleton variational distributions and fail to account for the correlations between data points, which might be crucial for learning latent representations from dataset where a priori we know correlations exist. We propose Correlated Variational Auto-Encoders (CVAEs) that can take the correlation structure into consideration when learning latent representations with VAEs. CVAEs apply a prior based on the correlation structure. To address the intractability introduced by the correlated prior, we develop an approximation by average of a set of tractable lower bounds over all maximal acyclic subgraphs of the undirected correlation graph. Experimental results on matching and link prediction on public benchmark rating datasets and spectral clustering on a synthetic dataset show the effectiveness of the proposed method over baseline algorithms.