Bayesian Inference
Spatio-Temporal Graphical Model Selection
Harrington, Patrick L. Jr., Hero, Alfred O. III
This paper treats the problem of learning the interaction structure of a spatiotemporal graphical model for a discrete state and discrete time stochastic process known as the susceptible, infected, recovered (SIR) model. The presence of spatial interactions cause adjacent nodes in the graph to affect each others states over time. Learning the topology of this graph is known as model selection. We cast this graphical model selection problem as a penalized likelihood problem, resulting in a convex program for which convex optimization solvers can be applied. SIR spatiotemporal graphical models are commonly used in modeling the random propagation of information between nodes in large networks in bioinformatics, signal processing, public health, and national security (4; 9; 21). Knowing the network link structure allows accurate prediction of individual node states and can aid the development of control and intervention strategies for such networks. This paper develops a tractable method to estimate the topology of the network for the SIR spatiotemporal graphical model from empirical data. Exact solutions of the graphical model selection problem is NP hard due to the combinatorial nature of enumeration through the discrete space of possible graph topologies.
A Minimum Relative Entropy Principle for Learning and Acting
Ortega, Pedro A., Braun, Daniel A.
This paper proposes a method to construct an adaptive agent that is universal with respect to a given class of experts, where each expert is an agent that has been designed specifically for a particular environment. This adaptive control problem is formalized as the problem of minimizing the relative entropy of the adaptive agent from the expert that is most suitable for the unknown environment. If the agent is a passive observer, then the optimal solution is the well-known Bayesian predictor. However, if the agent is active, then its past actions need to be treated as causal interventions on the I/O stream rather than normal probability conditions. Here it is shown that the solution to this new variational problem is given by a stochastic controller called the Bayesian control rule, which implements adaptive behavior as a mixture of experts. Furthermore, it is shown that under mild assumptions, the Bayesian control rule converges to the control law of the most suitable expert.
An Agile and Accessible Adaptation of Bayesian Inference to Medical Diagnostics for Rural Health Extension Workers
Robertson, Joel (Robertson Research Institute) | DeHart, Del J. (Robertson Research Institute)
We have adapted an expert system of medical diagnosis for use by low to mid-level health workers in remote and rural locations. Key to the successful deployment of this expert system is the rapid adaptation of the database and clinical interface for use in specific regions and by varying user skill.
A Model for Quality of Schooling
Moussavi, Massoud (Causal Links, LLC) | McGinn, Noel (Causal Links, LLC)
A key challenge for policymakers in many developing countries is to decide which intervention or collection of interventions works best to improve learning outcomes in their schools. Our aim is to develop a causal model that explains student learning outcomes in terms of observable characteristics as well as conditions and processes difficult to observe directly. We start with a theoretical model based on the results of previous research, direct experience and expertsโ knowledge in the field. This model is then refined through application of supervised learning methods to available data sets. Once calibrated with local data in a country, the model estimates the probability that a given intervention would affect learning outcomes.
Join-Graph Propagation Algorithms
Mateescu, R., Kask, K., Gogate, V., Dechter, R.
The paper investigates parameterized approximate message-passing schemes that are based on bounded inference and are inspired by Pearls belief propagation algorithm (BP). We start with the bounded inference mini-clustering algorithm and then move to the iterative scheme called Iterative Join-Graph Propagation (IJGP), that combines both iteration and bounded inference. Algorithm IJGP belongs to the class of Generalized Belief Propagation algorithms, a framework that allowed connections with approximate algorithms from statistical physics and is shown empirically to surpass the performance of mini-clustering and belief propagation, as well as a number of other state-of-the-art algorithms on several classes of networks. We also provide insight into the accuracy of iterative BP and IJGP by relating these algorithms to well known classes of constraint propagation schemes.
What does Newcomb's paradox teach us?
Wolpert, David H., Benford, Gregory
In Newcomb's paradox you choose to receive either the contents of a particular closed box, or the contents of both that closed box and another one. Before you choose, a prediction algorithm deduces your choice, and fills the two boxes based on that deduction. Newcomb's paradox is that game theory appears to provide two conflicting recommendations for what choice you should make in this scenario. We analyze Newcomb's paradox using a recent extension of game theory in which the players set conditional probability distributions in a Bayes net. We show that the two game theory recommendations in Newcomb's scenario have different presumptions for what Bayes net relates your choice and the algorithm's prediction. We resolve the paradox by proving that these two Bayes nets are incompatible. We also show that the accuracy of the algorithm's prediction, the focus of much previous work, is irrelevant. In addition we show that Newcomb's scenario only provides a contradiction between game theory's expected utility and dominance principles if one is sloppy in specifying the underlying Bayes net. We also show that Newcomb's paradox is time-reversal invariant; both the paradox and its resolution are unchanged if the algorithm makes its `prediction' after you make your choice rather than before.
Feature Importance in Bayesian Assessment of Newborn Brain Maturity from EEG
Jakaite, L., Schetinin, V., Maple, C.
The methodology of Bayesian Model Averaging (BMA) is applied for assessment of newborn brain maturity from sleep EEG. In theory this methodology provides the most accurate assessments of uncertainty in decisions. However, the existing BMA techniques have been shown providing biased assessments in the absence of some prior information enabling to explore model parameter space in details within a reasonable time. The lack in details leads to disproportional sampling from the posterior distribution. In case of the EEG assessment of brain maturity, BMA results can be biased because of the absence of information about EEG feature importance. In this paper we explore how the posterior information about EEG features can be used in order to reduce a negative impact of disproportional sampling on BMA performance. We use EEG data recorded from sleeping newborns to test the efficiency of the proposed BMA technique.
Syntactic Topic Models
Boyd-Graber, Jordan, Blei, David M.
The syntactic topic model (STM) is a Bayesian nonparametric model of language that discovers latent distributions of words (topics) that are both semantically and syntactically coherent. The STM models dependency parsed corpora where sentences are grouped into documents. It assumes that each word is drawn from a latent topic chosen by combining document-level features and the local syntactic context. Each document has a distribution over latent topics, as in topic models, which provides the semantic consistency. Each element in the dependency parse tree also has a distribution over the topics of its children, as in latent-state syntax models, which provides the syntactic consistency. These distributions are convolved so that the topic of each word is likely under both its document and syntactic context. We derive a fast posterior inference algorithm based on variational methods. We report qualitative and quantitative studies on both synthetic data and hand-parsed documents. We show that the STM is a more predictive model of language than current models based only on syntax or only on topics.
Convergence of Bayesian Control Rule
Ortega, Pedro A., Braun, Daniel A.
Recently, new approaches to adaptive control have sought to reformulate the problem as a minimization of a relative entropy criterion to obtain tractable solutions. In particular, it has been shown that minimizing the expected deviation from the causal input-output dependencies of the true plant leads to a new promising stochastic control rule called the Bayesian control rule. This work proves the convergence of the Bayesian control rule under two sufficient assumptions: boundedness, which is an ergodicity condition; and consistency, which is an instantiation of the sure-thing principle.
Modeling of Human Criminal Behavior using Probabilistic Networks
Pillai, Ramesh Kumar Gopala, P, Dr. Ramakanth Kumar .
Currently, criminal's profile (CP) is obtained from investigator's or forensic psychologist's interpretation, linking crime scene characteristics and an offender's behavior to his or her characteristics and psychological profile. This paper seeks an efficient and systematic discovery of non-obvious and valuable patterns between variables from a large database of solved cases via a probabilistic network (PN) modeling approach. The PN structure can be used to extract behavioral patterns and to gain insight into what factors influence these behaviors. Thus, when a new case is being investigated and the profile variables are unknown because the offender has yet to be identified, the observed crime scene variables are used to infer the unknown variables based on their connections in the structure and the corresponding numerical (probabilistic) weights. The objective is to produce a more systematic and empirical approach to profiling, and to use the resulting PN model as a decision tool.