AITopics | Bayesian Inference

Collaborating Authors

Bayesian Inference

Bayes' Theorem allows a program to infer the probabilities of likely causes from the probabilities of their effects, when what it is given are the probabilities of effects, given the causes.

News Overviews Instructional Materials AI-Alerts Classics

A Note on Posterior Probability Estimation for Classifiers

Nalbantov, Georgi, Ivanov, Svetoslav

arXiv.org Machine LearningSep-12-2019

One of the central themes in the classification task is the estimation of class posterior probability at a new point $\bf{x}$. The vast majority of classifiers output a score for $\bf{x}$, which is monotonically related to the posterior probability via an unknown relationship. There are many attempts in the literature to estimate this latter relationship. Here, we provide a way to estimate the posterior probability without resorting to using classification scores. Instead, we vary the prior probabilities of classes in order to derive the ratio of pdf's at point $\bf{x}$, which is directly used to determine class posterior probabilities. We consider here the binary classification problem.

artificial intelligence, bayesian inference, machine learning, (15 more...)

arXiv.org Machine Learning

1909.05894

Country: North America > United States > Virginia (0.14)

Genre: Research Report (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.31)

Add feedback

Regularized Estimation and Feature Selection in Mixtures of Gaussian-Gated Experts Models

Chamroukhi, Faïcel, Lecocq, Florian, Nguyen, Hien D.

arXiv.org Machine LearningSep-12-2019

Mixtures-of-Experts models and their maximum likelihood estimation (MLE) via the EM algorithm have been thoroughly studied in the statistics and machine learning literature. They are subject of a growing investigation in the context of modeling with high-dimensional predictors with regularized MLE. We examine MoE with Gaussian gating network, for clustering and regression, and propose an $\ell_1$-regularized MLE to encourage sparse models and deal with the high-dimensional setting. We develop an EM-Lasso algorithm to perform parameter estimation and utilize a BIC-like criterion to select the model parameters, including the sparsity tuning hyperparameters. Experiments conducted on simulated data show the good performance of the proposed regularized MLE compared to the standard MLE with the EM algorithm.

artificial intelligence, bayesian inference, machine learning, (2 more...)

arXiv.org Machine Learning

1909.05494

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.73)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.53)

Add feedback

Modular Meta-Learning with Shrinkage

Chen, Yutian, Friesen, Abram L., Behbahani, Feryal, Budden, David, Hoffman, Matthew W., Doucet, Arnaud, de Freitas, Nando

arXiv.org Artificial IntelligenceSep-12-2019

Most gradient-based approaches to meta-learning do not explicitly account for the fact that different parts of the underlying model adapt by different amounts when applied to a new task. For example, the input layers of an image classification convnet typically adapt very little, while the output layers can change significantly. This can cause parts of the model to begin to overfit while others underfit. To address this, we introduce a hierarchical Bayesian model with per-module shrinkage parameters, which we propose to learn by maximizing an approximation of the predictive likelihood using implicit differentiation. Our algorithm subsumes Reptile and outperforms variants of MAML on two synthetic few-shot meta-learning problems.

artificial intelligence, international conference, machine learning, (15 more...)

arXiv.org Artificial Intelligence

1909.05557

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Add feedback

Maximum Likelihood Constraint Inference for Inverse Reinforcement Learning

Scobee, Dexter R. R., Sastry, S. Shankar

arXiv.org Artificial IntelligenceSep-12-2019

While most approaches to the problem of Inverse Reinforcement Learning (IRL) focus on estimating a reward function that best explains an expert agent's policy or demonstrated behavior on a control task, it is often the case that such behavior is more succinctly described by a simple reward combined with a set of hard constraints. In this setting, the agent is attempting to maximize cumulative rewards subject to these given constraints on their behavior. We reformulate the problem of IRL on Markov Decision Processes (MDPs) such that, given a nominal model of the environment and a nominal reward function, we seek to estimate state, action, and feature constraints in the environment that motivate an agent's behavior. Our approach is based on the Maximum Entropy IRL framework, which allows us to reason about the likelihood of an expert agent's demonstrations given our knowledge of an MDP. Using our method, we can infer which constraints can be added to the MDP to most increase the likelihood of observing these demonstrations. We present an algorithm which iteratively infers the Maximum Likelihood Constraint to best explain observed behavior, and we evaluate its efficacy using both simulated behavior and recorded data of humans navigating around an obstacle.

constraint, machine learning, reinforcement learning, (19 more...)

arXiv.org Artificial Intelligence

1909.05477

Country: North America > United States (0.46)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.61)

Add feedback

5 Reasons to Learn Probability for Machine Learning

#artificialintelligenceSep-11-2019, 00:52:30 GMT

Probability is a field of mathematics that quantifies uncertainty. It is undeniably a pillar of the field of machine learning, and many recommend it as a prerequisite subject to study prior to getting started. This is misleading advice, as probability makes more sense to a practitioner once they have the context of the applied machine learning process in which to interpret it. In this post, you will discover why machine learning practitioners should study probabilities to improve their skills and capabilities. Before we go through the reasons that you should learn probability, let's start off by taking a small look at the reason why you should not.

artificial intelligence, bayesian inference, machine learning, (12 more...)

#artificialintelligence

Genre: Instructional Material > Course Syllabus & Notes (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

An Online Reinforcement Learning Approach to Quality-Cost-Aware Task Allocation for Multi-Attribute Social Sensing

Zhang, Yang, Zhang, Daniel, Vance, Nathan, Wang, Dong

arXiv.org Machine LearningSep-11-2019

Social sensing has emerged as a new sensing paradigm where humans (or devices on their behalf) collectively report measurements about the physical world. This paper focuses on a quality-cost-aware task allocation problem in multi-attribute social sensing applications. The goal is to identify a task allocation strategy (i.e., decide when and where to collect sensing data) to achieve an optimized tradeoff between the data quality and the sensing cost. While recent progress has been made to tackle similar problems, three important challenges have not been well addressed: (i) "online task allocation": the task allocation schemes need to respond quickly to the potentially large dynamics of the measured variables in social sensing; (ii) "multi-attribute constrained optimization": minimizing the overall sensing error given the dependencies and constraints of multiple attributes of the measured variables is a nontrivial problem to solve; (iii) "nonuniform task allocation cost": the task allocation cost in social sensing often has a nonuniform distribution which adds additional complexity to the optimized task allocation problem. We evaluate the QCO-TA scheme through a real-world social sensing application and the results show that our scheme significantly outperforms the state-of-the-art baselines in terms of both sensing accuracy and cost. Introduction This paper presents an online reinforcement learning framework to solve the quality-cost-aware task allocation problem in multi-attribute social sensing applications. Social sensing has emerged as a new sensing paradigm in pervasive and mobile computing applications where humans (or devices on their behalf) collectively report measurements about the physical world [1, 2]. Examples of social sensing applications include air quality and environment monitoring in smart cities using mobile devices [3], malfunctioning urban infrastructures reporting using geotagging [4], and damage assessment in disaster response using online social media [5]. In social sensing applications, participants perform sensing tasks at assigned locations to collect different attributes of the measured variables that are of interests to the application [6]. For example, in an urban air quality sensing application, participants are tasked to measure various air quality attributes (e.g., PM 2.5, PM 10, CO 2) at different locations of the city to estimate the overall air quality and identity potential health risks. We refer to this category of applications as multi-attribute social sensing applications . In multi-attribute social sensing applications, there exists a fundamental tradeoff between data quality and sensing (task allocation) cost [3, 7]. 2 In particular, it is essential to obtain comprehensive and accurate measurements to ensure the desired data quality of the social sensing applications.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Machine Learning

1909.05388

Country: North America (0.28)

Genre: Research Report > New Finding (0.48)

Industry: Transportation (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Add feedback

Correcting Predictions for Approximate Bayesian Inference

Kuśmierczyk, Tomasz, Sakaya, Joseph, Klami, Arto

arXiv.org Machine LearningSep-11-2019

Bayesian models quantify uncertainty and facilitate optimal decision-making in downstream applications. For most models, however, practitioners are forced to use approximate inference techniques that lead to sub-optimal decisions due to incorrect posterior predictive distributions. We present a novel approach that corrects for inaccuracies in posterior inference by altering the decision-making process. We train a separate model to make optimal decisions under the approximate posterior, combining interpretable Bayesian modeling with optimization of direct predictive accuracy in a principled fashion. The solution is generally applicable as a plug-in module for predictive decision-making for arbitrary probabilistic programs, irrespective of the posterior inference strategy. We demonstrate the approach empirically in several problems, confirming its potential.

approximation, artificial intelligence, machine learning, (16 more...)

arXiv.org Machine Learning

1909.04919

Country: Europe > Finland (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Efficient Bayesian synthetic likelihood with whitening transformations

Priddle, Jacob W., Sisson, Scott A., Drovandi, Christopher

arXiv.org Machine LearningSep-11-2019

Likelihood-free methods are an established approach for performing approximate Bayesian inference for models with intractable likelihood functions. However, they can be computationally demanding. Bayesian synthetic likelihood (BSL) is a popular such method that approximates the likelihood function of the summary statistic with a known, tractable distribution -- typically Gaussian -- and then performs statistical inference using standard likelihood-based techniques. However, as the number of summary statistics grows, the number of model simulations required to accurately estimate the covariance matrix for this likelihood rapidly increases. This poses significant challenge for the application of BSL, especially in cases where model simulation is expensive. In this article we propose whitening BSL (wBSL) -- an efficient BSL method that uses approximate whitening transformations to decorrelate the summary statistics at each algorithm iteration. We show empirically that this can reduce the number of model simulations required to implement BSL by more than an order of magnitude, without much loss of accuracy. We explore a range of whitening procedures and demonstrate the performance of wBSL on a range of simulated and real modelling scenarios from ecology and biology.

artificial intelligence, machine learning, model simulation, (16 more...)

arXiv.org Machine Learning

1909.04857

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)

Add feedback

Correlation Priors for Reinforcement Learning

Alt, Bastian, Šošić, Adrian, Koeppl, Heinz

arXiv.org Artificial IntelligenceSep-11-2019

Many decision-making problems naturally exhibit pronounced structures inherited from the underlying characteristics of the environment. In a Markov decision process model, for example, two distinct states can have inherently related semantics or encode resembling physical state configurations, often implying locally correlated transition dynamics among the states. In order to complete a certain task, an agent acting in such environments needs to execute a series of temporally and spatially correlated actions. Though there exists a variety of approaches to account for correlations in continuous state-action domains, a principled solution for discrete environments is missing. In this work, we present a Bayesian learning framework based on P\'olya-Gamma augmentation that enables an analogous reasoning in such cases. We demonstrate the framework on a number of common decision-making related tasks, such as reinforcement learning, imitation learning and system identification. By explicitly modeling the underlying correlation structures, the proposed approach yields superior predictive performance compared to correlation-agnostic models, even when trained on data sets that are up to an order of magnitude smaller in size.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

arXiv.org Artificial Intelligence

1909.05106

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.66)

Add feedback

Scalable Structure Learning of Continuous-Time Bayesian Networks from Incomplete Data

Linzner, Dominik, Schmidt, Michael, Koeppl, Heinz

arXiv.org Machine LearningSep-10-2019

Continuous-time Bayesian Networks (CTBNs) represent a compact yet powerful framework for understanding multivariate time-series data. Given complete data, parameters and structure can be estimated efficiently in closed-form. However, if data is incomplete, the latent states of the CTBN have to be estimated by laboriously simulating the intractable dynamics of the assumed CTBN. This is a problem, especially for structure learning tasks, where this has to be done for each element of super-exponentially growing set of possible structures. In order to circumvent this notorious bottleneck, we develop a novel gradient-based approach to structure learning. Instead of sampling and scoring all possible structures individually, we assume the generator of the CTBN to be composed as a mixture of generators stemming from different structures. In this framework, structure learning can be performed via a gradient-based optimization of mixture weights. We combine this approach with a novel variational method that allows for the calculation of the marginal likelihood of a mixture in closed-form. We proof the scalability of our method by learning structures of previously inaccessible sizes from synthetic and real-world data.

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Machine Learning

1909.0457

Genre: Research Report (0.40)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.86)

Add feedback