Bayesian Learning
A Linear Belief Function Approach to Portfolio Evaluation
Liu, Liping, Shenoy, Catherine, Shenoy, Prakash P.
By elaborating on the notion of linear belief functions (Dempster 1990; Liu 1996), we propose an elementary approach to knowledge representation for expert systems using linear belief functions. We show how to use basic matrices to represent market information and financial knowledge, including complete ignorance, statistical observations, subjective speculations, distributional assumptions, linear relations, and empirical asset pricing models. We then appeal to Dempster's rule of combination to integrate the knowledge for assessing an overall belief of portfolio performance, and updating the belief by incorporating additional information. We use an example of three gold stocks to illustrate the approach.
Large-Sample Learning of Bayesian Networks is NP-Hard
Chickering, David Maxwell, Meek, Christopher, Heckerman, David
In this paper, we provide new complexity results for algorithms that learn discrete-variable Bayesian networks from data. Our results apply whenever the learning algorithm uses a scoring criterion that favors the simplest model able to represent the generative distribution exactly. Our results therefore hold whenever the learning algorithm uses a consistent scoring criterion and is applied to a sufficiently large dataset. We show that identifying high-scoring structures is hard, even when we are given an independence oracle, an inference oracle, and/or an information oracle. Our negative results also apply to the learning of discrete-variable Bayesian networks in which each node has at most k parents, for all k > 3.
A Robust Independence Test for Constraint-Based Learning of Causal Structure
Dash, Denver, Druzdzel, Marek J.
Constraint-based (CB) learning is a formalism for learning a causal network with a database D by performing a series of conditional-independence tests to infer structural information. This paper considers a new test of independence that combines ideas from Bayesian learning, Bayesian network inference, and classical hypothesis testing to produce a more reliable and robust test. The new test can be calculated in the same asymptotic time and space required for the standard tests such as the chi-squared test, but it allows the specification of a prior distribution over parameters and can be used when the database is incomplete. We prove that the test is correct, and we demonstrate empirically that, when used with a CB causal discovery algorithm with noninformative priors, it recovers structural features more reliably and it produces networks with smaller KL-Divergence, especially as the number of nodes increases or the number of records decreases. Another benefit is the dramatic reduction in the probability that a CB algorithm will stall during the search, providing a remedy for an annoying problem plaguing CB learning when the database is small.
A Simple Insight into Iterative Belief Propagation's Success
Dechter, Rina, Mateescu, Robert
In Non - ergodic belief networks the posterior belief OF many queries given evidence may become zero.The paper shows that WHEN belief propagation IS applied iteratively OVER arbitrary networks(the so called, iterative OR loopy belief propagation(IBP)) it IS identical TO an arc - consistency algorithm relative TO zero - belief queries(namely assessing zero posterior probabilities). This implies that zero - belief conclusions derived BY belief propagation converge AND are sound.More importantly it suggests that the inference power OF IBP IS AS strong AND AS weak, AS that OF arc - consistency.This allows the synthesis OF belief networks FOR which belief propagation IS useless ON one hand, AND focuses the investigation OF classes OF belief network FOR which belief propagation may be zero - complete.Finally, ALL the above conclusions apply also TO Generalized belief propagation algorithms that extend loopy belief propagation AND allow a crisper understanding OF their power.
Symbolic Generalization for On-line Planning
Feng, Zhengzhu, Hansen, Eric A., Zilberstein, Shlomo
Symbolic representations have been used successfully in off-line planning algorithms for Markov decision processes. We show that they can also improve the performance of on-line planners. In addition to reducing computation time, symbolic generalization can reduce the amount of costly real-world interactions required for convergence. We introduce Symbolic Real-Time Dynamic Programming (or sRTDP), an extension of RTDP. After each step of on-line interaction with an environment, sRTDP uses symbolic model-checking techniques to generalizes its experience by updating a group of states rather than a single state. We examine two heuristic approaches to dynamic grouping of states and show that they accelerate the planning process significantly in terms of both CPU time and the number of steps of interaction with the environment.
Incremental Compilation of Bayesian networks
Flores, Julia M., Gamez, Jose A., Olesen, Kristian G.
Most methods of exact probability propagation in Bayesian networks do not carry out the inference directly over the network, but over a secondary structure known as a junction tree or a join tree (JT). The process of obtaining a JT is usually termed {sl compilation}. As compilation is usually viewed as a whole process; each time the network is modified, a new compilation process has to be carried out. The possibility of reusing an already existing JT, in order to obtain the new one regarding only the modifications in the network has received only little attention in the literature. In this paper we present a method for incremental compilation of a Bayesian network, following the classical scheme in which triangulation plays the key role. In order to perform incremental compilation we propose to recompile only those parts of the JT which can have been affected by the networks modifications. To do so, we exploit the technique OF maximal prime subgraph decomposition in determining the minimal subgraph(s) that have to be recompiled, and thereby the minimal subtree(s) of the JT that should be replaced by new subtree(s).We focus on structural modifications : addition and deletion of links and variables.
New Advances in Inference by Recursive Conditioning
Recursive Conditioning (RC) was introduced recently as the first any-space algorithm for inference in Bayesian networks which can trade time for space by varying the size of its cache at the increment needed to store a floating point number. Under full caching, RC has an asymptotic time and space complexity which is comparable to mainstream algorithms based on variable elimination and clustering (exponential in the network treewidth and linear in its size). We show two main results about RC in this paper. First, we show that its actual space requirements under full caching are much more modest than those needed by mainstream methods and study the implications of this finding. Second, we show that RC can effectively deal with determinism in Bayesian networks by employing standard logical techniques, such as unit resolution, allowing a significant reduction in its time requirements in certain cases. We illustrate our results using a number of benchmark networks, including the very challenging ones that arise in genetic linkage analysis.
Value Elimination: Bayesian Inference via Backtracking Search
Bacchus, Fahiem, Dalmao, Shannon, Pitassi, Toniann
Backtracking search is a powerful algorithmic paradigm that can be used to solve many problems. It is in a certain sense the dual of variable elimination; but on many problems, e.g., SAT, it is vastly superior to variable elimination in practice. Motivated by this we investigate the application of backtracking search to the problem of Bayesian inference (Bayes). We show that natural generalizations of known techniques allow backtracking search to achieve performance guarantees similar to standard algorithms for Bayes, and that there exist problems on which backtracking can in fact do much better. We also demonstrate that these ideas can be applied to implement a Bayesian inference engine whose performance is competitive with standard algorithms. Since backtracking search can very naturally take advantage of context specific structure, the potential exists for performance superior to standard algorithms on many problems.
An Empirical Study of w-Cutset Sampling for Bayesian Networks
Bidyuk, Bozhena, Dechter, Rina
The paper studies empirically the time-space trade-off between sampling and inference in a sl cutset sampling algorithm. The algorithm samples over a subset of nodes in a Bayesian network and applies exact inference over the rest. Consequently, while the size of the sampling space decreases, requiring less samples for convergence, the time for generating each single sample increases. The w-cutset sampling selects a sampling set such that the induced-width of the network when the sampling set is observed is bounded by w, thus requiring inference whose complexity is exponential in w. In this paper, we investigate performance of w-cutset sampling over a range of w values and measure the accuracy of w-cutset sampling as a function of w. Our experiments demonstrate that the cutset sampling idea is quite powerful showing that an optimal balance between inference and sampling benefits substantially from restricting the cutset size, even at the cost of more complex inference.
On Triangulating Dynamic Graphical Models
Bilmes, Jeff A., Bartels, Chris
This paper introduces new methodology to triangulate dynamic Bayesian networks (DBNs) and dynamic graphical models (DGMs). While most methods to triangulate such networks use some form of constrained elimination scheme based on properties of the underlying directed graph, we find it useful to view triangulation and elimination using properties only of the resulting undirected graph, obtained after the moralization step. We first briefly introduce the Graphical model toolkit (GMTK) and its notion of dynamic graphical models, one that slightly extends the standard notion of a DBN. We next introduce the 'boundary algorithm', a method to find the best boundary between partitions in a dynamic model. We find that using this algorithm, the notions of forward- and backward-interface become moot - namely, the size and fill-in of the best forward- and backward- interface are identical. Moreover, we observe that finding a good partition boundary allows for constrained elimination orders (and therefore graph triangulations) that are not possible using standard slice-by-slice constrained eliminations. More interestingly, with certain boundaries it is possible to obtain constrained elimination schemes that lie outside the space of possible triangulations using only unconstrained elimination. Lastly, we report triangulation results on invented graphs, standard DBNs from the literature, novel DBNs used in speech recognition research systems, and also random graphs. Using a number of different triangulation quality measures (max clique size, state-space, etc.), we find that with our boundary algorithm the triangulation quality can dramatically improve.