Bayesian Learning
Hybrid Processing of Beliefs and Constraints
Dechter, Rina, Larkin, David Ephraim
This paper explores algorithms for processing probabilistic and deterministic information when the former is represented as a belief network and the latter as a set of boolean clauses. The motivating tasks are 1. evaluating beliefs networks having a large number of deterministic relationships and2. evaluating probabilities of complex boolean querie over a belief network. We propose a parameterized family of variable elimination algorithms that exploit both types of information, and that allows varying levels of constraint propagation inferences. The complexity of the scheme is controlled by the induced-width of the graph {em augmented} by the dependencies introduced by the boolean constraints. Preliminary empirical evaluation demonstrate the effect of constraint propagation on probabilistic computation.
Using Bayesian Networks to Identify the Causal Effect of Speeding in Individual Vehicle/Pedestrian Collisions
On roads showing significant violations of posted speed limits, one measure of the safety effect of speeding is the difference between the road's actual accident count and the count that would have occurred if the posted speed limit had been strictly obeyed. An estimate of this accident reduction can be had by computing the probability that speeding was a necessary condition for each of set of accidents. This is an instance of assessing individual probabilities of causation, which is generally not possible absent prior knowledge of causal structure. For traffic accidents such prior knowledge is often available and this paper illustrates how, for a commonly occurring class of vehicle/pedestrian accidents, approaches to uncertainty and causal analyses appearing in the accident reconstruction literature can be unified using Bayesian networks. Measured skidmarks, pedestrian throw distances, and pedestrian injury severity are treated as evidence, and using the Gibbs Sampling routine BUGS, the posterior probability distribution over exogenous variables, such as the vehicle's initial speed, location, and driver reaction time, is computed. This posterior distribution is then used to compute the "probability of necessity" for speeding.
Linearity Properties of Bayes Nets with Binary Variables
It is "well known" that in linear models: (1) testable constraints on the marginal distribution of observed variables distinguish certain cases in which an unobserved cause jointly influences several observed variables; (2) the technique of "instrumental variables" sometimes permits an estimation of the influence of one variable on another even when the association between the variables may be confounded by unobserved common causes; (3) the association (or conditional probability distribution of one variable given another) of two variables connected by a path or trek can be computed directly from the parameter values associated with each edge in the path or trek; (4) the association of two variables produced by multiple treks can be computed from the parameters associated with each trek; and (5) the independence of two variables conditional on a third implies the corresponding independence of the sums of the variables over all units conditional on the sums over all units of each of the original conditioning variables.These properties are exploited in search procedures. It is also known that properties (2)-(5) do not hold for all Bayes nets with binary variables. We show that (1) holds for all Bayes nets with binary variables and (5) holds for all singly trek-connected Bayes nets of that kind. We further show that all five properties hold for Bayes nets with any DAG and binary variables parameterized with noisy-or and noisy-and gates.
Conditions Under Which Conditional Independence and Scoring Methods Lead to Identical Selection of Bayesian Network Models
It is often stated in papers tackling the task of inferring Bayesian network structures from data that there are these two distinct approaches: (i) Apply conditional independence tests when testing for the presence or otherwise of edges; (ii) Search the model space using a scoring metric. Here I argue that for complete data and a given node ordering this division is a myth, by showing that cross entropy methods for checking conditional independence are mathematically identical to methods based upon discriminating between models by their overall goodness-of-fit logarithmic scores.
Confidence Inference in Bayesian Networks
Cheng, Jian, Druzdzel, Marek J.
We present two sampling algorithms for probabilistic confidence inference in Bayesian networks. These two algorithms (we call them AIS-BN-mu and AIS-BN-sigma algorithms) guarantee that estimates of posterior probabilities are with a given probability within a desired precision bound. Our algorithms are based on recent advances in sampling algorithms for (1) estimating the mean of bounded random variables and (2) adaptive importance sampling in Bayesian networks. In addition to a simple stopping rule for sampling that they provide, the AIS-BN-mu and AIS-BN-sigma algorithms are capable of guiding the learning process in the AIS-BN algorithm. An empirical evaluation of the proposed algorithms shows excellent performance, even for very unlikely evidence.
Instrumentality Tests Revisited
An instrument is a random variable thatallows the identification of parameters inlinear models when the error terms arenot uncorrelated.It is a popular method used in economicsand the social sciences that reduces theproblem of identification to the problemof finding the appropriate instruments.Few years ago, Pearl introduced a necessarytest for instruments that allows the researcher to discard those candidatesthat fail the test.In this paper, we make a detailed study of Pearl's test and the general model forinstruments. The results of this studyinclude a novel interpretation of Pearl'stest, a general theory of instrumentaltests, and an affirmative answer to aprevious conjecture. We also presentnew instrumentality tests for the casesof discrete and continuous variables.
Markov Chain Monte Carlo using Tree-Based Priors on Model Structure
Angelopoulos, Nicos, Cussens, James
We present a general framework for defining priors on model structure and sampling from the posterior using the Metropolis-Hastings algorithm. The key idea is that structure priors are defined via a probability tree and that the proposal mechanism for the Metropolis-Hastings algorithm operates by traversing this tree, thereby defining a cheaply computable acceptance probability. We have applied this approach to Bayesian net structure learning using a number of priors and tree traversal strategies. Our results show that these must be chosen appropriately for this approach to be successful.
Artificial Intelligence Framework for Simulating Clinical Decision-Making: A Markov Decision Process Approach
Bennett, Casey C., Hauser, Kris
In the modern healthcare system, rapidly expanding costs/complexity, the growing myriad of treatment options, and exploding information streams that often do not effectively reach the front lines hinder the ability to choose optimal treatment decisions over time. The goal in this paper is to develop a general purpose (non-disease-specific) computational/artificial intelligence (AI) framework to address these challenges. This serves two potential functions: 1) a simulation environment for exploring various healthcare policies, payment methodologies, etc., and 2) the basis for clinical artificial intelligence - an AI that can think like a doctor. This approach combines Markov decision processes and dynamic decision networks to learn from clinical data and develop complex plans via simulation of alternative sequential decision paths while capturing the sometimes conflicting, sometimes synergistic interactions of various components in the healthcare system. It can operate in partially observable environments (in the case of missing observations or data) by maintaining belief states about patient health status and functions as an online agent that plans and re-plans. This framework was evaluated using real patient data from an electronic health record. Such an AI framework easily outperforms the current treatment-as-usual (TAU) case-rate/fee-for-service models of healthcare (Cost per Unit Change: $189 vs. $497) while obtaining a 30-35% increase in patient outcomes. Tweaking certain model parameters further enhances this advantage, obtaining roughly 50% more improvement for roughly half the costs. Given careful design and problem formulation, an AI simulation framework can approximate optimal decisions even in complex and uncertain environments. Future work is described that outlines potential lines of research and integration of machine learning algorithms for personalized medicine.
Heteroscedastic Relevance Vector Machine
Khashabi, Daniel, Ziyadi, Mojtaba, Liang, Feng
In this work we propose a heteroscedastic generalization to RVM, a fast Bayesian framework for regression, based on some recent similar works. We use variational approximation and expectation propagation to tackle the problem. The work is still under progress and we are examining the results and comparing with the previous works.
Role Mining with Probabilistic Models
Frank, Mario, Buhmann, Joachim M., Basin, David
Role mining tackles the problem of finding a role-based access control (RBAC) configuration, given an access-control matrix assigning users to access permissions as input. Most role mining approaches work by constructing a large set of candidate roles and use a greedy selection strategy to iteratively pick a small subset such that the differences between the resulting RBAC configuration and the access control matrix are minimized. In this paper, we advocate an alternative approach that recasts role mining as an inference problem rather than a lossy compression problem. Instead of using combinatorial algorithms to minimize the number of roles needed to represent the access-control matrix, we derive probabilistic models to learn the RBAC configuration that most likely underlies the given matrix. Our models are generative in that they reflect the way that permissions are assigned to users in a given RBAC configuration. We additionally model how user-permission assignments that conflict with an RBAC configuration emerge and we investigate the influence of constraints on role hierarchies and on the number of assignments. In experiments with access-control matrices from real-world enterprises, we compare our proposed models with other role mining methods. Our results show that our probabilistic models infer roles that generalize well to new system users for a wide variety of data, while other models' generalization abilities depend on the dataset given.