
Collaborating Authors: probability propagation


A Revolution: Belief Propagation in Graphs with Cycles

Neural Information Processing Systems

Until recently, artificial intelligence researchers have frowned upon the application of probability propagation in Bayesian belief networks that have cycles. The probability propagation algorithm is only exact in networks that are cycle-free. However, it has recently been discovered that the two best error-correcting decoding algorithms are actually performing probability propagation in belief networks with cycles. Our increasingly wired world demands efficient methods for communicating bits of information over physical channels that introduce errors. Examples of real-world channels include twisted-pair telephone wires, shielded cable-TV wire, fiber-optic cable, deep-space radio, terrestrial radio, and indoor radio.
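As orientation for the algorithm this abstract refers to, here is a minimal sketch of probability propagation (sum-product message passing) on a cycle-free network, where the abstract notes it is exact: a three-node chain of binary variables. All tables and names are illustrative, not taken from the paper.

    import numpy as np

    p_x1 = np.array([0.6, 0.4])                   # P(X1), illustrative
    p_x2_given_x1 = np.array([[0.7, 0.3],         # rows: X1, cols: X2
                              [0.2, 0.8]])
    p_x3_given_x2 = np.array([[0.9, 0.1],         # rows: X2, cols: X3
                              [0.4, 0.6]])

    # Evidence: X3 = 1. Pass one message forward and one backward.
    forward = p_x1 @ p_x2_given_x1                # prior message into X2: P(X2)
    backward = p_x3_given_x2[:, 1]                # likelihood message: P(X3=1 | X2)

    posterior = forward * backward                # combine at X2 ...
    posterior /= posterior.sum()                  # ... and normalize: P(X2 | X3=1)
    print(posterior)

On a tree, combining such forward and backward messages yields exact posteriors; the surprise the abstract reports is that iterating the same local updates on graphs with cycles is exactly what the best error-correcting decoders do.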


Evidence Absorption and Propagation through Evidence Reversals

Shachter, Ross D.

arXiv.org Artificial Intelligence

The arc reversal/node reduction approach to probabilistic inference is extended to include the case of instantiated evidence by an operation called "evidence reversal." This not only provides a technique for computing posterior joint distributions on general belief networks, but also provides insight into the methods of Pearl [1986b] and Lauritzen and Spiegelhalter [1988]. Although it is well understood that the latter two algorithms are closely related, in fact all three algorithms are identical whenever the belief network is a forest.
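For readers unfamiliar with arc reversal, the following is a minimal numpy sketch of the Bayes-rule step the operation is built on; "evidence reversal" applies the same step when the head of the arc is instantiated. The probability tables are invented for illustration.

    import numpy as np

    p_x = np.array([0.3, 0.7])                    # P(X), illustrative
    p_y_given_x = np.array([[0.8, 0.2],           # P(Y | X=0)
                            [0.1, 0.9]])          # P(Y | X=1)

    # Form the joint P(X, Y), then reverse the arc X -> Y.
    joint = p_x[:, None] * p_y_given_x            # shape (X, Y)
    p_y = joint.sum(axis=0)                       # new marginal P(Y)
    p_x_given_y = joint / p_y                     # new conditional P(X | Y)

    # Absorb the evidence Y = 1 by reading off the reversed arc.
    posterior_x = p_x_given_y[:, 1]
    print(p_y, posterior_x)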


Investigation of Variances in Belief Networks

Neapolitan, Richard E., Kenevan, James

arXiv.org Artificial Intelligence

The belief network is a well-known graphical structure for representing independences in a joint probability distribution. The methods that perform probabilistic inference in belief networks often treat the conditional probabilities which are stored in the network as certain values. However, if one takes either a subjectivistic or a limiting frequency approach to probability, one can never be certain of probability values. An algorithm should not only be capable of reporting the probabilities of the alternatives of remaining nodes when other nodes are instantiated; it should also be capable of reporting the uncertainty in these probabilities relative to the uncertainty in the probabilities which are stored in the network. In this paper a method for determining the variances in inferred probabilities is obtained under the assumption that a posterior distribution on the uncertainty variables can be approximated by the prior distribution. It is shown that this assumption is plausible if there is a reasonable amount of confidence in the probabilities which are stored in the network. Furthermore, in this paper a surprising upper bound for the prior variances in the probabilities of the alternatives of all nodes is obtained in the case where the probability distributions of the probabilities of the alternatives are beta distributions. It is shown that the prior variance in the probability at an alternative of a node is bounded above by the largest variance in an element of the conditional probability distribution for that node.
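A small Monte Carlo check may make the bound concrete: when the entries of a node's conditional distribution are beta-distributed, the prior variance of the node's inferred probability should not exceed the largest variance among those entries. The network below (one binary parent X, one binary child Y) and all beta parameters are illustrative choices, not from the paper.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 200_000

    def beta_var(a, b):
        # Variance of a Beta(a, b) random variable.
        return a * b / ((a + b) ** 2 * (a + b + 1))

    # P(X=1) ~ Beta(4, 6); P(Y=1|X=0) ~ Beta(2, 8); P(Y=1|X=1) ~ Beta(6, 2).
    px1 = rng.beta(4, 6, n)
    th0 = rng.beta(2, 8, n)
    th1 = rng.beta(6, 2, n)

    py1 = (1 - px1) * th0 + px1 * th1             # P(Y=1) for each sampled network

    print("prior Var P(Y=1):", py1.var())
    print("paper's bound   :", max(beta_var(2, 8), beta_var(6, 2)))

Running this, the sampled variance of P(Y=1) (about 0.016) stays below the largest CPT-entry variance (about 0.021), consistent with the stated bound.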


Accumulator Networks: Suitors of Local Probability Propagation

Frey, Brendan J., Kannan, Anitha

Neural Information Processing Systems

One way to approximate inference in richly-connected graphical models is to apply the sum-product algorithm (a.k.a. probability propagation). The sum-product algorithm can be directly applied in Gaussian networks and in graphs for coding, but for many conditional probability functions - including the sigmoid function - direct application of the sum-product algorithm is not possible. We introduce "accumulator networks" that have low local complexity (but exponential global complexity) so the sum-product algorithm can be directly applied. In an accumulator network, the probability of a child given its parents is computed by accumulating the inputs from the parents in a Markov chain or more generally a tree. After giving expressions for inference and learning in accumulator networks, we give results on the "bars problem" and on the problem of extracting translated, overlapping faces from an image.

1 Introduction

Graphical probability models with hidden variables are capable of representing complex dependencies between variables, filling in missing data and making Bayes-optimal decisions using probabilistic inferences (Hinton and Sejnowski 1986; Pearl 1988; Neal 1992). Large, richly-connected networks with many cycles can potentially be used to model complex sources of data, such as audio signals, images and video. However, when the number of cycles in the network is large (more precisely, when the cut set size is exponential), exact inference becomes intractable. Also, to learn a probability model with hidden variables, we need to fill in the missing data using probabilistic inference, so learning also becomes intractable. To cope with the intractability of exact inference, a variety of approximate inference methods have been invented, including Monte Carlo (Hinton and Sejnowski 1986; Neal 1992), Helmholtz machines (Dayan et al. 1995; Hinton et al. 1995), and variational techniques (Jordan et al. 1998).
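A rough sketch of the accumulator construction described above: the conditional P(child | parents) is routed through a chain of accumulation variables, each depending on only the previous accumulator and one parent, so every local function the sum-product algorithm touches stays small. The soft-OR accumulation rule and all numbers below are invented for illustration.

    import numpy as np

    K = 3
    parents = [1, 0, 1]                           # observed binary parents

    def p_s_given(prev_s, parent):
        # P(s_k | s_{k-1}, parent_k): a soft OR of the parent's noisy
        # contribution (0.9) and carry-over from the previous state (0.8).
        on = 1 - (1 - 0.9 * parent) * (1 - 0.8 * prev_s)
        return np.array([1 - on, on])

    # Sum out the chain left to right: belief over s_k given parents 1..k.
    belief = np.array([1.0, 0.0])                 # s_0 = 0 deterministically
    for k in range(K):
        belief = sum(belief[s] * p_s_given(s, parents[k]) for s in (0, 1))

    # The child depends only on the final accumulator s_K.
    p_child_given_s = np.array([[0.95, 0.05],     # P(child | s_K=0)
                                [0.10, 0.90]])    # P(child | s_K=1)
    p_child = belief @ p_child_given_s
    print(p_child)

Each local table involves at most three variables, which is the "low local complexity" the abstract trades against exponential global complexity.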


Sequentially Fitting ``Inclusive'' Trees for Inference in Noisy-OR Networks

Frey, Brendan J., Patrascu, Relu, Jaakkola, Tommi, Moran, Jodi

Neural Information Processing Systems

For example, in medical diagnosis, the presence of a symptom can be expressed as a noisy-OR of the diseases that may cause the symptom - on some occasions, a disease may fail to activate the symptom. Inference in richly-connected noisy-OR networks is intractable, but approximate methods (e.g., variational techniques) are showing increasing promise as practical solutions. One problem with most approximations is that they tend to concentrate on a relatively small number of modes in the true posterior, ignoring other plausible configurations of the hidden variables. We introduce a new sequential variational method for bipartite noisy-OR networks that favors including all modes of the true posterior and models the posterior distribution as a tree. We compare this method with other approximations using an ensemble of networks with network statistics that are comparable to the QMR-DT medical diagnostic network.

1 Inclusive variational approximations

Approximate algorithms for probabilistic inference are gaining in popularity and are now even being incorporated into VLSI hardware (T.
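The noisy-OR conditional mentioned in the abstract is easy to state concretely: each active disease independently fails to turn the symptom on with some probability, and a leak term covers unmodeled causes. A minimal sketch with invented activation probabilities and leak:

    import numpy as np

    q = np.array([0.8, 0.6, 0.3])                 # per-disease activation probs
    leak = 0.05                                   # symptom appears with no disease
    d = np.array([1, 0, 1])                       # which diseases are present

    # Symptom stays off only if the leak and every active disease fail.
    p_symptom_off = (1 - leak) * np.prod((1 - q) ** d)
    p_symptom_on = 1 - p_symptom_off
    print(p_symptom_on)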


Sequentially Fitting ``Inclusive'' Trees for Inference in Noisy-OR Networks

Frey, Brendan J., Patrascu, Relu, Jaakkola, Tommi, Moran, Jodi

Neural Information Processing Systems

Exact inference in large, richly connected noisy-OR networks is intractable, and most approximate inference algorithms tend to concentrate on a small number of the most probable configurations of the hidden variables under the posterior. We presented an "inclusive" variational method for bipartite noisy-OR networks that favors including all probable configurations, at the cost of including some improbable configurations. The method fits a tree to the posterior distribution sequentially, i.e., one observation at a time. Results on an ensemble of QMR-DT type networks show that the method performs better than local probability propagation and a variational upper bound for ranking most probable diseases.
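A toy computation may clarify what "inclusive" means here: fitting q by minimizing KL(p||q) (the inclusive direction) rewards covering every mode of p, while the usual variational direction KL(q||p) rewards hugging one mode. The bimodal target and Gaussian candidates below are purely illustrative; the paper fits tree-structured distributions, not Gaussians.

    import numpy as np

    x = np.linspace(-12, 12, 4001)
    dx = x[1] - x[0]

    def normal(x, m, s):
        return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))

    p = 0.5 * normal(x, -3, 1) + 0.5 * normal(x, 3, 1)   # bimodal target

    def kl(a, b):
        # Grid approximation of KL(a || b) for densities on x.
        mask = a > 1e-300
        return np.sum(a[mask] * np.log(a[mask] / b[mask])) * dx

    q_hug = normal(x, 3, 1)                       # sits on one mode
    q_cover = normal(x, 0, np.sqrt(10))           # moment-matched, covers both

    print("inclusive KL(p||q): hug =", kl(p, q_hug), " cover =", kl(p, q_cover))
    print("exclusive KL(q||p): hug =", kl(q_hug, p), " cover =", kl(q_cover, p))

The inclusive direction heavily penalizes the mode-hugging candidate (it leaves half of p uncovered), whereas the exclusive direction prefers it; that is the trade-off the abstract describes.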


Error-correcting Codes on a Bethe-like Lattice

Vicente, Renato, Saad, David, Kabashima, Yoshiyuki

Neural Information Processing Systems

We analyze Gallager codes by employing a simple mean-field approximation that distorts the model geometry and preserves important interactions between sites. The method naturally recovers the probability propagation decoding algorithm as an extremization of a proper free energy. We find a thermodynamic phase transition that coincides with information-theoretic upper bounds and explain the practical code performance in terms of the free-energy landscape.
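For orientation, the generic statement of this correspondence (in the factor-graph form due to Yedidia, Freeman, and Weiss; the paper itself derives its free energy on a Bethe-like lattice for Gallager codes) is that fixed points of probability propagation are stationary points of the Bethe free energy

    F_{\mathrm{Bethe}}[\{b_a, b_i\}]
      = \sum_a \sum_{x_a} b_a(x_a) \ln \frac{b_a(x_a)}{f_a(x_a)}
      - \sum_i (d_i - 1) \sum_{x_i} b_i(x_i) \ln b_i(x_i),

extremized over factor beliefs b_a and variable beliefs b_i subject to normalization and marginalization constraints, where the f_a are the factors (here, parity checks and channel likelihoods) and d_i is the number of factors touching variable i.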

