Uncertainty
Myopic Bayesian Design of Experiments via Posterior Sampling and Probabilistic Programming
Kandasamy, Kirthevasan, Neiswanger, Willie, Zhang, Reed, Krishnamurthy, Akshay, Schneider, Jeff, Poczos, Barnabas
We design a new myopic strategy for a wide class of sequential design of experiment (DOE) problems, where the goal is to collect data in order to to fulfil a certain problem specific goal. Our approach, Myopic Posterior Sampling (MPS), is inspired by the classical posterior (Thompson) sampling algorithm for multi-armed bandits and leverages the flexibility of probabilistic programming and approximate Bayesian inference to address a broad set of problems. Empirically, this general-purpose strategy is competitive with more specialised methods in a wide array of DOE tasks, and more importantly, enables addressing complex DOE goals where no existing method seems applicable. On the theoretical side, we leverage ideas from adaptive submodularity and reinforcement learning to derive conditions under which MPS achieves sublinear regret against natural benchmark policies.
Pooling of Causal Models under Counterfactual Fairness via Causal Judgement Aggregation
Zennaro, Fabio Massimo, Ivanovska, Magdalena
In this paper we consider the problem of combining multiple probabilistic causal models, provided by different experts, under the requirement that the aggregated model satisfy the criterion of counterfactual fairness. We build upon the work on causal models and fairness in machine learning, and we express the problem of combining multiple models within the framework of opinion pooling. We propose two simple algorithms, grounded in the theory of counterfactual fairness and causal judgment aggregation, that are guaranteed to generate aggregated probabilistic causal models respecting the criterion of fairness, and we compare their behaviors on a toy case study.
Learning and Testing Causal Models with Interventions
Acharya, Jayadev, Bhattacharyya, Arnab, Daskalakis, Constantinos, Kandasamy, Saravanan
We consider testing and learning problems on causal Bayesian networks as defined by Pearl (Pearl, 2009). Given a causal Bayesian network $\mathcal{M}$ on a graph with $n$ discrete variables and bounded in-degree and bounded `confounded components', we show that $O(\log n)$ interventions on an unknown causal Bayesian network $\mathcal{X}$ on the same graph, and $\tilde{O}(n/\epsilon^2)$ samples per intervention, suffice to efficiently distinguish whether $\mathcal{X}=\mathcal{M}$ or whether there exists some intervention under which $\mathcal{X}$ and $\mathcal{M}$ are farther than $\epsilon$ in total variation distance. We also obtain sample/time/intervention efficient algorithms for: (i) testing the identity of two unknown causal Bayesian networks on the same graph; and (ii) learning a causal Bayesian network on a given graph. Although our algorithms are non-adaptive, we show that adaptivity does not help in general: $\Omega(\log n)$ interventions are necessary for testing the identity of two unknown causal Bayesian networks on the same graph, even adaptively. Our algorithms are enabled by a new subadditivity inequality for the squared Hellinger distance between two causal Bayesian networks.
A distinct approach to diagnose Dengue Fever with the help of Soft Set Theory
Bukhari, Syeda fariha, Amjad, Maaz
Mathematics has played a substantial role to revolutionize the medical science. Intelligent systems based on mathematical theories have proved to be efficient in diagnosing various diseases. In this paper, we used an expert system based on soft set theory and fuzzy set theory named as a soft expert system to diagnose tropical disease dengue. The objective to use soft expert system is to predict the risk level of a patient having dengue fever by using input variables like age, TLC, SGOT, platelets count and blood pressure. The proposed method explicitly demonstrates the exact percentage of the risk level of dengue fever automatically circumventing for all possible (medical) imprecisions.
Scalable Bayesian Learning for State Space Models using Variational Inference with SMC Samplers
Hirt, Marcel, Dellaportas, Petros
We present a scalable approach to performing approximate fully Bayesian inference in generic state space models. The proposed method is an alternative to particle MCMC that provides full Bayesian inference of both the dynamic latent states and the static parameters of the model. We build up on recent advances in computational statistics that combine variational methods with sequential Monte Carlo sampling and we demonstrate the advantages of performing full Bayesian inference over the static parameters rather than just performing variational EM approximations. We illustrate how our approach enables scalable inference in multivariate stochastic volatility models and self-exciting point process models that allow for flexible dynamics in the latent intensity function.
Likelihood-free inference with emulator networks
Lueckmann, Jan-Matthis, Bassetto, Giacomo, Karaletsos, Theofanis, Macke, Jakob H.
Approximate Bayesian Computation (ABC) provides methods for Bayesian inference in simulation-based stochastic models which do not permit tractable likelihoods. We present a new ABC method which uses probabilistic neural emulator networks to learn synthetic likelihoods on simulated data - both'local' emulators which approximate the likelihood for specific observed data, as well as'global' ones which are applicable to a range of data. Simulations are chosen adaptively using an acquisition function which takes into account uncertainty about either the posterior distribution of interest, or the parameters of the emulator. Our approach does not rely on user-defined rejection thresholds or distance functions. We illustrate inference with emulator networks on synthetic examples and on a biophysical neuron model, and show that emulators allow accurate and efficient inference even on high-dimensional problems which are challenging for conventional ABC approaches.
Large Data and Zero Noise Limits of Graph-Based Semi-Supervised Learning Algorithms
Dunlop, Matthew M., Slepčev, Dejan, Stuart, Andrew M., Thorpe, Matthew
Scalings in which the graph Laplacian approaches a differential operator in the large graph limit are used to develop understanding of a number of algorithms for semi-supervised learning; in particular the extension, to this graph setting, of the probit algorithm, level set and kriging methods, are studied. Both optimization and Bayesian approaches are considered, based around a regularizing quadratic form found from an affine transformation of the Laplacian, raised to a, possibly fractional, exponent. Conditions on the parameters defining this quadratic form are identified under which well-defined limiting continuum analogues of the optimization and Bayesian semi-supervised learning problems may be found, thereby shedding light on the design of algorithms in the large graph setting. The large graph limits of the optimization formulations are tackled through $\Gamma$-convergence, using the recently introduced $TL^p$ metric. The small labelling noise limit of the Bayesian formulations are also identified, and contrasted with pre-existing harmonic function approaches to the problem.
Particle Filter Networks: End-to-End Probabilistic Localization From Visual Observations
Karkus, Peter, Hsu, David, Lee, Wee Sun
Particle filters sequentially approximate posterior distributions by sampling representative points and updating them independently. The idea is applied in various domains, e.g. reasoning with uncertainty in robotics. A remaining challenge is constructing probabilistic models of the system, which can be especially hard for complex sensors, e.g. a camera. We introduce the Particle Filter Networks (PF-nets) that encode both a learned probabilistic system model and the particle filter algorithm in a single neural network architecture. The unified representation allows learning models end-to-end, circumventing the difficulties of conventional model-based methods. We applied PF-nets to a challenging visual localization task that requires matching visual features from camera images with the geometry encoded in a 2-D floor map. In preliminary experiments end-to-end PF-nets consistently outperformed alternative learning architectures, as well as conventional model-based methods.
Bayesian Inference of Regular Expressions from Human-Generated Example Strings
In programming by example, users "write" programs by generating a small number of input-output examples and asking the computer to synthesize consistent programs. We consider an unsolved problem in this domain: learning regular expressions (regexes) from positive and negative example strings. This problem is challenging, as (1) user-generated examples may not be informative enough to sufficiently constrain the hypothesis space, and (2) even if user-generated examples are in principle informative, there is still a massive search space to examine. We frame regex induction as the problem of inferring a probabilistic regular grammar and propose an efficient inference approach that uses a novel stochastic process recognition model. This model incrementally "grows" a grammar using positive examples as a scaffold. We show that this approach is competitive with human ability to learn regexes from examples.
Functional Decision Theory: A New Theory of Instrumental Rationality
Yudkowsky, Eliezer, Soares, Nate
This paper describes and motivates a new decision theory known as functional decision theory (FDT), as distinct from causal decision theory and evidential decision theory. Functional decision theorists hold that the normative principle for action is to treat one's decision as the output of a fixed mathematical function that answers the question, "Which output of this very function would yield the best outcome?" Adhering to this principle delivers a number of benefits, including the ability to maximize wealth in an array of traditional decision-theoretic and game-theoretic problems where CDT and EDT perform poorly. Using one simple and coherent decision rule, functional decision theorists (for example) achieve more utility than CDT on Newcomb's problem, more utility than EDT on the smoking lesion problem, and more utility than both in Parfit's hitchhiker problem. In this paper, we define FDT, explore its prescriptions in a number of different decision problems, compare it to CDT and EDT, and give philosophical justifications for FDT as a normative theory of decision-making.