Uncertainty
Faster Algorithms for Max-Product Message-Passing
McAuley, Julian J., Caetano, Tiberio S.
Maximum a posteriori (MAP) inference in graphical models is often solved via message-passing algorithms, such as the junction-tree algorithm or loopy belief propagation. The exact solution to this problem is well known to be exponential in the size of the model's maximal cliques after triangulation, while approximate inference is typically exponential in the size of the model's factors. In this paper, we take advantage of the fact that many models have maximal cliques that are larger than their constituent factors, and of the fact that many factors consist entirely of latent variables (i.e., they do not depend on an observation). This is a common case in a wide variety of applications, including grids, trees, and ring-structured models. In such cases, we are able to decrease the exponent of complexity for message-passing by 0.5 for both exact and approximate inference.
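The baseline cost the paper improves on is easiest to see in code. Below is a minimal sketch (our own illustration, not the paper's faster algorithm) of max-product message-passing on a three-variable chain: each message maximizes over one variable, so its cost is exponential in the factor size.

    import numpy as np

    # Max-product forward pass on a chain x0 - x1 - x2 of K-state variables.
    # Each message costs O(K^2): the factor-size exponent the paper reduces.
    K = 4
    rng = np.random.default_rng(0)
    unary = rng.random((3, K))          # unary potentials for x0, x1, x2
    pairwise = rng.random((2, K, K))    # potentials for (x0,x1) and (x1,x2)

    msg = np.ones(K)
    for i in range(2):                  # msg_{i -> i+1}(x_{i+1}) = max over x_i
        msg = np.max(pairwise[i] * (unary[i] * msg)[:, None], axis=0)
    belief = unary[2] * msg             # max-marginal of x2
    print("MAP value:", belief.max())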
On Tsallis Entropy Bias and Generalized Maximum Entropy Models
Hou, Yuexian, Yan, Tingxu, Zhang, Peng, Song, Dawei, Li, Wenjie
In the density estimation task, the maximum entropy model (Maxent) can effectively use reliable prior information via certain constraints, i.e., linear constraints without empirical parameters. However, reliable prior information is often insufficient, and the selection of uncertain constraints becomes necessary but poses considerable implementation complexity. Improper settings of uncertain constraints can result in overfitting or underfitting. To solve this problem, a generalization of Maxent, under the Tsallis entropy framework, is proposed. The proposed method introduces a convex quadratic constraint for the correction of the (expected) Tsallis entropy bias (TEB). Specifically, we demonstrate that the expected Tsallis entropy of sampling distributions is smaller than the Tsallis entropy of the underlying real distribution. This expected entropy reduction is exactly the (expected) TEB, which can be expressed by a closed-form formula and acts as a consistent and unbiased correction. TEB indicates that the entropy of a specific sampling distribution should be increased accordingly. This entails a quantitative reinterpretation of the Maxent principle. By compensating for TEB and at the same time forcing the resulting distribution to be close to the sampling distribution, our generalized TEBC Maxent can be expected to alleviate both overfitting and underfitting. We also present a connection between TEB and the Lidstone estimator. As a result, a TEB-Lidstone estimator is developed by analytically identifying the rate of probability correction in Lidstone estimation. Extensive empirical evaluation shows promising performance of both TEBC Maxent and TEB-Lidstone in comparison with various state-of-the-art density estimation methods.
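For reference, the Tsallis entropy of order q (a standard definition, not taken from the paper) and the direction of the bias the abstract describes are

\[
S_q(p) \;=\; \frac{1}{q-1}\Bigl(1 - \sum_i p_i^{\,q}\Bigr), \qquad q > 0,\; q \neq 1,
\qquad\text{with}\qquad
\mathbb{E}\bigl[S_q(\hat{p})\bigr] \;\le\; S_q(p),
\]

where \(\hat{p}\) is the empirical (sampling) distribution of a finite sample from \(p\); the gap between the two sides is the expected TEB, whose closed form is derived in the paper.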
Incorporating Side Information in Probabilistic Matrix Factorization with Gaussian Processes
Adams, Ryan Prescott, Dahl, George E., Murray, Iain
Probabilistic matrix factorization (PMF) is a powerful method for modeling data associated with pairwise relationships, finding use in collaborative filtering, computational biology, and document analysis, among other areas. In many domains, there is additional information that can assist in prediction. For example, when modeling movie ratings, we might know when the rating occurred, where the user lives, or what actors appear in the movie. It is difficult, however, to incorporate this side information into the PMF model. We propose a framework for incorporating side information by coupling together multiple PMF problems via Gaussian process priors. We replace scalar latent features with functions that vary over the space of side information. The GP priors on these functions require them to vary smoothly and share information. We successfully use this new method to predict the scores of professional basketball games, where side information about the venue and date of the game is relevant to the outcome.
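Schematically, and under our reading of the abstract (the notation below is ours, not the paper's), the construction replaces

\[
\text{PMF:}\quad R_{nm} \sim \mathcal{N}\bigl(u_n^{\top} v_m,\ \sigma^2\bigr)
\qquad\text{with}\qquad
R_{nm}(x) \sim \mathcal{N}\bigl(u_n(x)^{\top} v_m(x),\ \sigma^2\bigr),
\]

where each scalar latent feature \(u_{nk}\) becomes a function \(u_{nk}(\cdot)\) over the side-information space with a GP prior, so observations made under similar side information \(x\) (e.g., nearby dates or the same venue) share statistical strength.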
Mimicking the Behaviour of Idiotypic AIS Robot Controllers Using Probabilistic Systems
Whitbrook, Amanda, Aickelin, Uwe, Garibaldi, Jonathan
Previous work has shown that robot navigation systems that employ an architecture based upon the idiotypic network theory of the immune system have an advantage over control techniques that rely on reinforcement learning alone. This is thought to be a result of intelligent behaviour selection on the part of the idiotypic robot. In this paper, an attempt is made to imitate idiotypic dynamics by creating controllers that use reinforcement together with a number of different probabilistic schemes to select robot behaviour. The aims are to show that the idiotypic system is not merely performing some kind of periodic random behaviour selection, and to gain further insight into the processes that govern the idiotypic mechanism. Trials are carried out using simulated Pioneer robots that undertake navigation exercises. Results show that a scheme that boosts the probability of selecting highly-ranked alternative behaviours to 50% during stall conditions comes closest to achieving the properties of the idiotypic system, but remains unable to match it in terms of all-round performance.
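As a rough illustration of the best-performing scheme described above (the function, behaviour names, and ranking/stall-detection details below are hypothetical; only the 50% boost figure comes from the abstract):

    import random

    def select_behaviour(ranked, stalled, boost=0.5):
        """Sketch: normally pick the top-ranked behaviour, but during a stall
        select the highest-ranked *alternative* with probability `boost`."""
        best, alternatives = ranked[0], ranked[1:]
        if stalled and alternatives and random.random() < boost:
            return alternatives[0]
        return best

    # e.g. select_behaviour(["wander", "avoid", "track"], stalled=True)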
An Agile and Accessible Adaptation of Bayesian Inference to Medical Diagnostics for Rural Health Extension Workers
Robertson, Joel (Robertson Research Institute), DeHart, Del J. (Robertson Research Institute)
We have adapted an expert system for medical diagnosis for use by low- to mid-level health workers in remote and rural locations. Key to the successful deployment of this expert system is the rapid adaptation of its database and clinical interface for use in specific regions and by users of varying skill levels.
Traffic Flow Monitoring in Crowded Cities
Quinn, John Alexander (Makerere University), Nakibuule, Rose (Makerere University)
Traffic monitoring systems usually make assumptions about the movement of vehicles, such as that they drive in dedicated lanes and that those lanes rarely include non-vehicle clutter. Urban settings in developing countries often present extremely chaotic traffic scenarios which make these assumptions unrealistic. We show how a standard approach to traffic monitoring can be made more robust by using probabilistic inference, and in such a way that we bypass the need for vehicle segmentation. Instead of tracking individual vehicles, we treat a lane of traffic as a fluid and estimate its rate of flow. Our modelling of uncertainty allows us to accurately monitor traffic flow even in the presence of substantial clutter.
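A minimal sketch of the fluid view, under our own simplifying assumptions (1-D lane intensity profiles, flow in one direction, plain cross-correlation rather than the paper's probabilistic model):

    import numpy as np

    def flow_rate(profiles, max_shift=20):
        """Estimate mean flow (pixels per frame) along a lane by finding, for
        each pair of consecutive 1-D profiles, the shift that maximises their
        cross-correlation. No vehicle segmentation is required."""
        shifts = []
        for a, b in zip(profiles[:-1], profiles[1:]):
            a, b = a - a.mean(), b - b.mean()
            scores = [np.dot(a[s:], b[:len(b) - s]) for s in range(max_shift)]
            shifts.append(int(np.argmax(scores)))
        return float(np.mean(shifts))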
A Model for Quality of Schooling
Moussavi, Massoud (Causal Links, LLC), McGinn, Noel (Causal Links, LLC)
A key challenge for policymakers in many developing countries is to decide which intervention or collection of interventions works best to improve learning outcomes in their schools. Our aim is to develop a causal model that explains student learning outcomes in terms of observable characteristics as well as conditions and processes difficult to observe directly. We start with a theoretical model based on the results of previous research, direct experience and experts’ knowledge in the field. This model is then refined through application of supervised learning methods to available data sets. Once calibrated with local data in a country, the model estimates the probability that a given intervention would affect learning outcomes.
Elliptical slice sampling
Murray, Iain, Adams, Ryan Prescott, MacKay, David J. C.
Many probabilistic models introduce strong dependencies between variables using a latent multivariate Gaussian distribution or a Gaussian process. We present a new Markov chain Monte Carlo algorithm for performing inference in models with multivariate Gaussian priors. Its key properties are: 1) it has simple, generic code applicable to many models, 2) it has no free parameters, 3) it works well for a variety of Gaussian process based models. These properties make our method ideal for use while model building, removing the need to spend time deriving and tuning updates for more complex algorithms.
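The update itself is short enough to state. The following is our transcription of an elliptical slice sampling step for a zero-mean Gaussian prior with covariance Sigma (supplied via its Cholesky factor); note there is nothing to tune, which illustrates the "no free parameters" property:

    import numpy as np

    def elliptical_slice(f, log_lik, chol_sigma, rng=np.random.default_rng()):
        """One elliptical slice sampling update of state f (our transcription;
        assumes a zero-mean Gaussian prior with Cholesky factor chol_sigma)."""
        nu = chol_sigma @ rng.standard_normal(f.shape)   # auxiliary ~ N(0, Sigma)
        log_y = log_lik(f) + np.log(rng.uniform())       # slice height
        theta = rng.uniform(0.0, 2.0 * np.pi)            # initial angle
        lo, hi = theta - 2.0 * np.pi, theta              # shrinking bracket
        while True:
            fp = f * np.cos(theta) + nu * np.sin(theta)  # point on the ellipse
            if log_lik(fp) > log_y:
                return fp                                # accepted
            if theta < 0.0:                              # shrink bracket, retry
                lo = theta
            else:
                hi = theta
            theta = rng.uniform(lo, hi)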
On the Failure of the Finite Model Property in some Fuzzy Description Logics
Bobillo, Fernando, Bou, Felix, Straccia, Umberto
Fuzzy Description Logics (DLs) are a family of logics which allow the representation of, and reasoning with, structured knowledge affected by vagueness. Although most crisp DLs of modest expressivity, such as ALC, enjoy the Finite Model Property (FMP), this is no longer the case once we move to the fuzzy setting. In this paper we show that, if we allow arbitrary knowledge bases, the fuzzy DL ALC does not satisfy the FMP under either Łukasiewicz or Product fuzzy logic, even if we restrict ourselves to witnessed models; in other words, finite satisfiability and witnessed satisfiability are different for arbitrary knowledge bases. The aim of this paper is to point out this failure of the FMP, because it affects several algorithms published in the literature for reasoning under fuzzy ALC.
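For orientation, the two t-norms in question and their residuated implications (standard definitions, not specific to this paper) are

\[
a \otimes_{\text{\L}} b = \max(0,\ a+b-1), \qquad a \Rightarrow_{\text{\L}} b = \min(1,\ 1-a+b),
\]
\[
a \otimes_{\Pi} b = a\,b, \qquad a \Rightarrow_{\Pi} b =
\begin{cases} 1 & \text{if } a \le b,\\ b/a & \text{otherwise;}\end{cases}
\]

a model is witnessed when the truth value of each quantified concept such as \(\exists r.C\) is attained by some individual, rather than only approached as a supremum or infimum.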
What does Newcomb's paradox teach us?
Wolpert, David H., Benford, Gregory
In Newcomb's paradox you choose to receive either the contents of a particular closed box, or the contents of both that closed box and another one. Before you choose, a prediction algorithm deduces your choice and fills the two boxes based on that deduction. Newcomb's paradox is that game theory appears to provide two conflicting recommendations for what choice you should make in this scenario. We analyze Newcomb's paradox using a recent extension of game theory in which the players set conditional probability distributions in a Bayes net. We show that the two game-theoretic recommendations in Newcomb's scenario presume different Bayes nets relating your choice and the algorithm's prediction. We resolve the paradox by proving that these two Bayes nets are incompatible. We also show that the accuracy of the algorithm's prediction, the focus of much previous work, is irrelevant. In addition, we show that Newcomb's scenario only provides a contradiction between game theory's expected utility and dominance principles if one is sloppy in specifying the underlying Bayes net. Finally, we show that Newcomb's paradox is time-reversal invariant: both the paradox and its resolution are unchanged if the algorithm makes its 'prediction' after you make your choice rather than before.
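To make the two conflicting recommendations concrete, take the standard payoffs (our illustrative numbers, not the paper's): the transparent box holds \$1{,}000, and the opaque box holds \$1{,}000{,}000 exactly when one-boxing was predicted. If the prediction is correct with probability \(p\), the expected-utility reading in which your choice is evidence about the prediction gives

\[
\mathrm{EU}(\text{one-box}) = p \cdot 10^{6}, \qquad
\mathrm{EU}(\text{two-box}) = (1-p)\cdot 10^{6} + 10^{3},
\]

so one-boxing is preferred whenever \(p > (10^{6} + 10^{3})/(2 \cdot 10^{6}) \approx 0.5005\), while the dominance reading, which holds the box contents fixed at decision time, recommends two-boxing regardless of \(p\). Which reading applies is precisely the question of which Bayes net relates your choice and the prediction.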