clique potential
Information Theoretic Properties of Markov Random Fields, and their Algorithmic Applications
Linus Hamilton, Frederic Koehler, Ankur Moitra
Markov random fields are a popular model for high-dimensional probability distributions. Over the years, many mathematical, statistical and algorithmic problems on them have been studied. Until recently, the only known algorithms for provably learning them relied on exhaustive search, correlation decay or various incoherence assumptions. Bresler [4] gave an algorithm for learning general Ising models on bounded degree graphs. His approach was based on a structural result about mutual information in Ising models. Here we take a more conceptual approach to proving lower bounds on the mutual information.
Information Theoretic Properties of Markov Random Fields, and their Algorithmic Applications
Linus Hamilton, Frederic Koehler, Ankur Moitra
Markov random fields are a popular model for high-dimensional probability distributions. Over the years, many mathematical, statistical and algorithmic problems on them have been studied. Until recently, the only known algorithms for provably learning them relied on exhaustive search, correlation decay or various incoherence assumptions. Bresler [4] gave an algorithm for learning general Ising models on bounded degree graphs. His approach was based on a structural result about mutual information in Ising models. Here we take a more conceptual approach to proving lower bounds on the mutual information.
Probabilistic Graphical Models and Tensor Networks: A Hybrid Framework
Miller, Jacob, Roeder, Geoffrey, Bradley, Tai-Danae
We investigate a correspondence between two formalisms for discrete probabilistic modeling: probabilistic graphical models (PGMs) and tensor networks (TNs), a powerful modeling framework for simulating complex quantum systems. The graphical calculus of PGMs and TNs exhibits many similarities, with discrete undirected graphical models (UGMs) being a special case of TNs. However, more general probabilistic TN models such as Born machines (BMs) employ complex-valued hidden states to produce novel forms of correlation among the probabilities. While representing a new modeling resource for capturing structure in discrete probability distributions, this behavior also renders the direct application of standard PGM tools impossible. We aim to bridge this gap by introducing a hybrid PGM-TN formalism that integrates quantum-like correlations into PGM models in a principled manner, using the physically-motivated concept of decoherence. We first prove that applying decoherence to the entirety of a BM model converts it into a discrete UGM, and conversely, that any subgraph of a discrete UGM can be represented as a decohered BM. This method allows a broad family of probabilistic TN models to be encoded as partially decohered BMs, a fact we leverage to combine the representational strengths of both model families. We experimentally verify the performance of such hybrid models in a sequential modeling task, and identify promising uses of our method within the context of existing applications of graphical models.
Uncertainty quantification for Markov Random Fields
Birmpa, Panagiota, Katsoulakis, Markos A.
We present an information-based uncertainty quantification method for general Markov Random Fields. Markov Random Fields (MRFs) are structured, probabilistic graphical models over undirected graphs, and provide a fundamental unifying modeling tool for statistical mechanics, probabilistic machine learning, and artificial intelligence. Typically MRFs are complex and high-dimensional with nodes and edges (connections) built in a modular fashion from simpler, low-dimensional probabilistic models and their local connections; in turn, this modularity allows to incorporate available data to MRFs and efficiently simulate them by leveraging their graph-theoretic structure. Learning graphical models from data and/or constructing them from physical modeling and constraints necessarily involves uncertainties inherited from data, modeling choices, or numerical approximations. These uncertainties in the MRF can be manifested either in the graph structure or the probability distribution functions, and necessarily will propagate in predictions for quantities of interest. Here we quantify such uncertainties using tight, information-based bounds on the predictions of quantities of interest; these bounds take advantage of the graphical structure of MRFs and are capable of handling the inherent high-dimensionality of such graphical models. We demonstrate our methods in MRFs for medical diagnostics and statistical mechanics models. In the latter, we develop uncertainty quantification bounds for finite-size effects and phase diagrams, which constitute two of the typical predictions goals of statistical mechanics modeling.
Scalable and transferable learning of algorithms via graph embedding for multi-robot reward collection
Kang, Hyunwook, Mynbay, Aydar, Morrison, James R., Park, Jinkyoo
Can the success of reinforcement learning methods for combinatorial optimization problems be extended to multi-robot scheduling problems in stochastic contexts? Three issues are particularly important in this context: quality of the resulting decisions, scalability, and transferability. To achieve these ends we generalize the concept of clique potential to stochastic clique potential. We extend a mean field inference fixed point iteration with this new concept and use it to modify thestructure2vec method. We next propose a new reinforcement learning framework combining a graph representation of the problem and a consensus auction inspired by heuristics in the problem domain. This representation enables transferability in terms of the number of robots. Sequential encoding of information through multiple layers of our extended structure2vec results in 96% optimal performance of the learned heuristics. While training tractability is inherited from single robot methods in the literature, use of a multi-robot consensus auction-based relaxation of the maximum operation in the Bellman optimality equation allows for scalable selection of actions in the fitted Q-iteration. We apply our framework to multi-robot reward collection (MRRC) problems in stochastic environments with linear or non-linear rewards. In stochastic environments with non-linear rewards, the new method achieves 20% superior performance relative to the popular sequential greedy assignment (SGA) algorithm. Linear scalability in terms of training is achieved and demonstrated. Transferability is demonstrated by the use of a heuristic trained with three robots that continues to achieve 95% optimal performance when applied to problems with various numbers of robots. We further mention the results obtained when extending the approach to identical parallel machine scheduling(IPMS) problems.
Collective Classification in Network Data
Many real-world applications produce networked data such as the worldwide web (hypertext documents connected through hyperlinks), social networks (such as people connected by friendship links), communication networks (computers connected through communication links), and biological networks (such as protein interaction networks). A recent focus in machine-learning research has been to extend traditional machine-learning classification techniques to classify nodes in such networks. In this article, we provide a brief introduction to this area of research and how it has progressed during the past decade. We introduce four of the most widely used inference algorithms for classifying networked data and empirically compare them on both synthetic and real-world data. Communication networks, financial transaction networks, networks describing physical systems, and social networks are all becoming increasingly important in our day-to-day life.
Information Theoretic Properties of Markov Random Fields, and their Algorithmic Applications
Hamilton, Linus, Koehler, Frederic, Moitra, Ankur
Markov random fields are a popular model for high-dimensional probability distributions. Over the years, many mathematical, statistical and algorithmic problems on them have been studied. Until recently, the only known algorithms for provably learning them relied on exhaustive search, correlation decay or various incoherence assumptions. Bresler gave an algorithm for learning general Ising models on bounded degree graphs. His approach was based on a structural result about mutual information in Ising models. Here we take a more conceptual approach to proving lower bounds on the mutual information. Our proof generalizes well beyond Ising models, to arbitrary Markov random fields with higher order interactions. As an application, we obtain algorithms for learning Markov random fields on bounded degree graphs on $n$ nodes with $r$-order interactions in $n^r$ time and $\log n$ sample complexity. Our algorithms also extend to various partial observation models.
Duality of Graphical Models and Tensor Networks
In this article we show the duality between tensor networks and undirected graphical models with discrete variables. We study tensor networks on hypergraphs, which we call tensor hypernetworks. We show that the tensor hypernetwork on a hypergraph exactly corresponds to the graphical model given by the dual hypergraph. We translate various notions under duality. For example, marginalization in a graphical model is dual to contraction in the tensor network. Algorithms also translate under duality. We show that belief propagation corresponds to a known algorithm for tensor network contraction. This article is a reminder that the research areas of graphical models and tensor networks can benefit from interaction.
Multiple Instance Learning by Discriminative Training of Markov Networks
Hajimirsadeghi, Hossein, Li, Jinling, Mori, Greg, Zaki, Mohammad, Sayed, Tarek
We introduce a graphical framework for multiple instance learning (MIL) based on Markov networks. This framework can be used to model the traditional MIL definition as well as more general MIL definitions. Different levels of ambiguity -- the portion of positive instances in a bag -- can be explored in weakly supervised data. To train these models, we propose a discriminative max-margin learning algorithm leveraging efficient inference for cardinality-based cliques. The efficacy of the proposed framework is evaluated on a variety of data sets. Experimental results verify that encoding or learning the degree of ambiguity can improve classification performance.
A General Algorithm for Approximate Inference and its Application to Hybrid Bayes Nets
Koller, Daphne, Lerner, Uri, Anguelov, Dragomir
The clique tree algorithm is the standard method for doing inference in Bayesian networks. It works by manipulating clique potentials - distributions over the variables in a clique. While this approach works well for many networks, it is limited by the need to maintain an exact representation of the clique potentials. This paper presents a new unified approach that combines approximate inference and the clique tree algorithm, thereby circumventing this limitation. Many known approximate inference algorithms can be viewed as instances of this approach. The algorithm essentially does clique tree propagation, using approximate inference to estimate the densities in each clique. In many settings, the computation of the approximate clique potential can be done easily using statistical importance sampling. Iterations are used to gradually improve the quality of the estimation.