Bayesian Inference
Machine Learning, Linear and Bayesian Models for Logistic Regression in Failure Detection Problems
In this work, we study the use of logistic regression for detecting manufacturing failures. For the analysis, we used the data set from the Kaggle competition Bosch Production Line Performance. We considered machine learning, linear, and Bayesian models. For the machine learning approach, we analyzed the tree-based XGBoost classifier to obtain a high classification score. Using the generalized linear model for logistic regression makes it possible to analyze the influence of the factors under study. The Bayesian approach to logistic regression gives the statistical distribution of the model parameters, which can be useful in probabilistic analysis, e.g. risk assessment.
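To illustrate what "a statistical distribution for the parameters" means in practice, here is a minimal sketch of Bayesian logistic regression using a random-walk Metropolis sampler. It is not the method used in the work above: the synthetic one-feature data, the Gaussian prior (`prior_sd`), the proposal scale (`step`), and all function names are assumptions made for illustration.

```python
import math
import random


def log_posterior(w, b, xs, ys, prior_sd=5.0):
    """Unnormalized log posterior for slope w and intercept b (assumed Gaussian prior)."""
    lp = -(w * w + b * b) / (2.0 * prior_sd ** 2)
    for x, y in zip(xs, ys):
        z = w * x + b
        # Bernoulli log-likelihood y*z - log(1 + exp(z)), written in a stable form
        lp += y * z - (max(z, 0.0) + math.log1p(math.exp(-abs(z))))
    return lp


def metropolis(xs, ys, n_steps=4000, step=0.3, seed=0):
    """Random-walk Metropolis over (w, b); returns the list of visited states."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    current = log_posterior(w, b, xs, ys)
    draws = []
    for _ in range(n_steps):
        w_new = w + rng.gauss(0.0, step)
        b_new = b + rng.gauss(0.0, step)
        proposed = log_posterior(w_new, b_new, xs, ys)
        # Accept with probability min(1, posterior ratio)
        if rng.random() < math.exp(min(0.0, proposed - current)):
            w, b, current = w_new, b_new, proposed
        draws.append((w, b))
    return draws


# Synthetic data (hypothetical): failure probability rises with the feature x
data_rng = random.Random(42)
xs = [data_rng.uniform(-2.0, 2.0) for _ in range(200)]
ys = [1 if data_rng.random() < 1.0 / (1.0 + math.exp(-(2.0 * x - 0.5))) else 0
      for x in xs]

draws = metropolis(xs, ys)[1000:]  # discard burn-in
w_mean = sum(w for w, _ in draws) / len(draws)
print("posterior mean of slope:", w_mean)
```

The retained draws approximate the joint posterior of the coefficients, so quantities such as a credible interval for the slope, or the probability that a given unit fails, can be read off directly from them.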
The Bayesian New Statistics: Hypothesis Testing, Estimation, Meta-Analysis, and Power Analysis from a Bayesian Perspective
Many people have found the table above to be useful for understanding two conceptual distinctions in the practice of data analysis. The article that discusses the table, and many other issues, is now in press. The in-press version can be found at OSF and at SSRN. Abstract: In the practice of data analysis, there is a conceptual distinction between hypothesis testing, on the one hand, and estimation with quantified uncertainty, on the other hand. Among frequentists in psychology a shift of emphasis from hypothesis testing to estimation has been dubbed "the New Statistics" (Cumming, 2014).
Priors on exchangeable directed graphs
Cai, Diana, Ackerman, Nathanael, Freer, Cameron
Directed graphs occur throughout statistical modeling of networks, and exchangeability is a natural assumption when the ordering of vertices does not matter. There is a deep structural theory for exchangeable undirected graphs, which extends to the directed case via measurable objects known as digraphons. Using digraphons, we first show how to construct models for exchangeable directed graphs, including special cases such as tournaments, linear orderings, directed acyclic graphs, and partial orderings. We then show how to construct priors on digraphons via the infinite relational digraphon model (di-IRM), a new Bayesian nonparametric block model for exchangeable directed graphs, and demonstrate inference on synthetic data.
Adversarial Message Passing For Graphical Models
Bayesian inference on structured models typically relies on the ability to infer posterior distributions of underlying hidden variables. However, inference in implicit models or complex posterior distributions is hard. A popular tool for learning implicit models is the generative adversarial network (GAN), which learns the parameters of a generator by fooling a discriminator. Typically, GANs are considered to be models themselves and are not understood in the context of inference. Current techniques rely on inefficient global discrimination of joint distributions to perform learning, or only consider discriminating a single output variable. We overcome these limitations by treating GANs as a basis for likelihood-free inference in generative models and generalize them to Bayesian posterior inference over factor graphs. We propose local learning rules based on message passing that minimize a global divergence criterion involving cooperating local adversaries, which are used to sidestep explicit likelihood evaluations. This allows us to compose models and yields a unified inference and learning framework for adversarial learning. Our framework treats model specification and inference separately and facilitates richly structured models within the family of Directed Acyclic Graphs, including components such as intractable likelihoods, non-differentiable models, simulators and generally cumbersome models. A key result of our treatment is the insight that Bayesian inference on structured models can be performed with sampling and discrimination alone when using nonparametric variational families, without access to explicit distributions. As a side result, we discuss the link to likelihood maximization. These approaches hold promise to be useful in the toolbox of probabilistic modelers and enrich the gamut of current probabilistic programming applications.
Scalable Group Level Probabilistic Sparse Factor Analysis
Hinrich, Jesper L., Nielsen, Søren F. V., Riis, Nicolai A. B., Eriksen, Casper T., Frøsig, Jacob, Kristensen, Marco D. F., Schmidt, Mikkel N., Madsen, Kristoffer H., Mørup, Morten
Many data-driven approaches exist to extract neural representations of functional magnetic resonance imaging (fMRI) data, but most of them lack a proper probabilistic formulation. We propose a group level scalable probabilistic sparse factor analysis (psFA) allowing spatially sparse maps, component pruning using automatic relevance determination (ARD) and subject specific heteroscedastic spatial noise modeling. For task-based and resting state fMRI, we show that the sparsity constraint gives rise to components similar to those obtained by group independent component analysis. The noise modeling shows that noise is reduced in areas typically associated with activation by the experimental design. The psFA model identifies sparse components and the probabilistic setting provides a natural way to handle parameter uncertainties. The variational Bayesian framework easily extends to more complex noise models than the presently considered.
Towards Adaptive Training of Agent-based Sparring Partners for Fighter Pilots
Israelsen, Brett W., Ahmed, Nisar, Center, Kenneth, Green, Roderick, Bennett, Winston Jr
A key requirement for the current generation of artificial decision-makers is that they should adapt well to changes in unexpected situations. This paper addresses the situation in which an AI for aerial dogfighting, with tunable parameters that govern its behavior, must optimize behavior with respect to an objective function that is evaluated and learned through simulations. Bayesian optimization with a Gaussian Process surrogate is used as the method for investigating the objective function. One key benefit is that during optimization, the Gaussian Process learns a global estimate of the true objective function, with predicted outcomes and a statistical measure of confidence in areas that haven't been investigated yet. Having a model of the objective function is important for being able to understand possible outcomes in the decision space; for example, this is crucial for training and providing feedback to human pilots. However, standard Bayesian optimization does not perform consistently or provide an accurate Gaussian Process surrogate function for highly volatile objective functions. We treat these problems by introducing a novel sampling technique called Hybrid Repeat/Multi-point Sampling. This technique gives the AI the ability to learn optimum behaviors in a highly uncertain environment. More importantly, it not only improves the reliability of the optimization, but also creates a better model of the entire objective surface. With this improved model the agent is equipped to predict performance in unexplored scenarios more accurately and efficiently.
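For readers unfamiliar with the baseline the abstract builds on, the following is a minimal sketch of standard Bayesian optimization with a Gaussian Process surrogate on a one-dimensional toy problem. It does not implement the paper's Hybrid Repeat/Multi-point Sampling. The RBF kernel and its length scale, the upper-confidence-bound acquisition rule, the evaluation grid, the toy objective, and all function names are assumptions chosen for illustration.

```python
import math


def rbf(a, b, length=0.3, var=1.0):
    """Squared-exponential kernel (assumed hyperparameters)."""
    return var * math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def gp_posterior(xs, ys, x_new, noise=1e-2):
    """Posterior mean and variance of a zero-mean GP at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [rbf(a, x_new) for a in xs]
    mean = sum(k * a for k, a in zip(k_star, solve(K, ys)))
    var = rbf(x_new, x_new) - sum(k * w for k, w in zip(k_star, solve(K, k_star)))
    return mean, max(var, 0.0)


def bayes_opt(objective, n_iter=10, kappa=2.0):
    """Maximize objective on [0, 1] with a GP surrogate and a UCB acquisition."""
    grid = [i / 50 for i in range(51)]
    xs = [0.0, 0.5, 1.0]  # initial design
    ys = [objective(x) for x in xs]
    for _ in range(n_iter):
        def ucb(x):
            m, v = gp_posterior(xs, ys, x)
            return m + kappa * math.sqrt(v)  # predicted value plus confidence bonus
        x_next = max(grid, key=ucb)
        xs.append(x_next)
        ys.append(objective(x_next))
    best = max(range(len(xs)), key=lambda i: ys[i])
    return xs[best], ys[best]


# Toy objective (hypothetical) with its maximum at x = 0.3
best_x, best_y = bayes_opt(lambda x: -(x - 0.3) ** 2)
print(best_x, best_y)
```

The posterior variance returned by `gp_posterior` is the "statistical measure of confidence" for regions that have not been evaluated yet; the acquisition rule trades it off against the predicted value when choosing the next point.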
Stochastic Quasi-Newton Langevin Monte Carlo
Şimşekli, Umut, Badeau, Roland, Cemgil, A. Taylan, Richard, Gaël
Recently, Stochastic Gradient Markov Chain Monte Carlo (SG-MCMC) methods have been proposed for scaling up Monte Carlo computations to large data problems. Whilst these approaches have proven useful in many applications, vanilla SG-MCMC might suffer from poor mixing rates when random variables exhibit strong couplings under the target densities or big scale differences. In this study, we propose a novel SG-MCMC method that takes the local geometry into account by using ideas from Quasi-Newton optimization methods. These second order methods directly approximate the inverse Hessian by using a limited history of samples and their gradients. Our method uses dense approximations of the inverse Hessian while keeping the time and memory complexities linear with the dimension of the problem. We provide a formal theoretical analysis where we show that the proposed method is asymptotically unbiased and consistent with the posterior expectations. We illustrate the effectiveness of the approach on both synthetic and real datasets. Our experiments on two challenging applications show that our method achieves fast convergence rates similar to Riemannian approaches while at the same time having low computational requirements similar to diagonal preconditioning approaches.
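As background for the "vanilla SG-MCMC" the abstract refers to, here is a minimal sketch of stochastic gradient Langevin dynamics without any preconditioning. It is not the quasi-Newton method proposed in the paper. The model (Gaussian observations with unit variance and a Gaussian prior on the mean), the step size, the batch size, and the function names are assumptions chosen so that the exact posterior mean is available for comparison.

```python
import random


def sgld_gaussian_mean(data, n_steps=5000, batch_size=50, step=1e-4,
                       prior_var=100.0, seed=0):
    """Vanilla SGLD for the mean of N(theta, 1) data with an N(0, prior_var) prior."""
    rng = random.Random(seed)
    n = len(data)
    theta = 0.0
    samples = []
    for _ in range(n_steps):
        batch = rng.sample(data, batch_size)
        grad_prior = -theta / prior_var
        # Minibatch gradient of the log-likelihood, rescaled to the full data set
        grad_lik = (n / batch_size) * sum(x - theta for x in batch)
        # Half a gradient step plus Gaussian noise with variance equal to the step size
        theta += 0.5 * step * (grad_prior + grad_lik) + rng.gauss(0.0, step ** 0.5)
        samples.append(theta)
    return samples


# Synthetic data (hypothetical): 1000 observations with true mean 2.0
data_rng = random.Random(1)
data = [data_rng.gauss(2.0, 1.0) for _ in range(1000)]

samples = sgld_gaussian_mean(data)
burned = samples[1000:]  # discard burn-in
estimate = sum(burned) / len(burned)

# Exact posterior mean for this conjugate model, for comparison
exact = sum(data) / (len(data) + 1.0 / 100.0)
print(estimate, exact)
```

Every coordinate here shares one scalar step size, which is what makes the vanilla scheme sensitive to strongly coupled or badly scaled variables; preconditioned variants replace that scalar with a matrix.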
Searching for the Master Algorithm - New Signature
It may sound trite, but humanity has come to dominate the world using this tool alone. Humans lack natural weapons, have no natural protection from the elements, and enter life as helpless infants. But our unique brains allow us to acquire, use, and communicate knowledge, and this advantage alone has allowed us to create the intricate social and technological reality we now inhabit. Our brains evolved to process, store, retrieve, and integrate sensory data into working knowledge that allows us to navigate reality. Until recently, humans were the only significant force that could translate raw data into accurate, actionable knowledge.
Bayes Theorem: A Visual Introduction For Beginners
From Google search results to Netflix recommendations and investment strategies, Bayes Theorem (also often called Bayes Rule or Bayes Formula) is used across countless industries to help calculate and assess probability. Bayesian statistics is taught in most first-year statistics classes across the nation, but there is one major problem that many students (and others who are interested in the theorem) face. The theorem is not intuitive for most people, and understanding how it works can be a challenge, especially because it is often taught without visual aids. In this guide, we unpack the various components of the theorem and provide a basic overview of how it works, with illustrations to help. Three scenarios (the flu, breathalyzer tests, and peacekeeping) are used throughout the booklet to teach how problems involving Bayes Theorem can be approached and solved.
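A small worked example in the spirit of the flu scenario shows how the theorem combines a prior with test accuracy. The numbers below are hypothetical and are not taken from the booklet; the function and parameter names are likewise chosen only for illustration.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(condition | positive test) via Bayes' theorem."""
    # Total probability of a positive test: true positives plus false positives
    evidence = sensitivity * prior + false_positive_rate * (1.0 - prior)
    return sensitivity * prior / evidence


# Hypothetical inputs: 5% of patients have the flu, the test detects 90% of
# true cases, and it wrongly flags 10% of healthy patients.
p = posterior(prior=0.05, sensitivity=0.90, false_positive_rate=0.10)
print(round(p, 3))  # prints 0.321
```

Even with a fairly accurate test, a positive result here implies only about a one-in-three chance of having the flu, because healthy patients vastly outnumber sick ones and contribute most of the positive results.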