AITopics | Bayesian Learning

Collaborating Authors

Bayesian Learning

A Bayesian network, Bayes network, belief network, Bayes(ian) model or probabilistic directed acyclic graphical model is a probabilistic graphical model (a type of statistical model) that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

Global Sensitivity Analysis for MAP Inference in Graphical Models

Neural Information Processing SystemsMar-13-2024, 06:00:29 GMT

We study the sensitivity of a MAP configuration of a discrete probabilistic graphical model with respect to perturbations of its parameters. These perturbations are global, in the sense that simultaneous perturbations of all the parameters (or any chosen subset of them) are allowed. Our main contribution is an exact algorithm that can check whether the MAP configuration is robust with respect to given perturbations. Its complexity is essentially the same as that of obtaining the MAP configuration itself, so it can be promptly used with minimal effort. We use our algorithm to identify the largest global perturbation that does not induce a change in the MAP configuration, and we successfully apply this robustness measure in two practical scenarios: the prediction of facial action units with posed images and the classification of multiple real public data sets. A strong correlation between the proposed robustness measure and accuracy is verified in both scenarios.

instantiation, map instantiation, perturbation, (15 more...)

Neural Information Processing Systems

Country:

South America > Paraguay > Asunción > Asunción (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Learning Large-Scale Poisson DAG Models based on OverDispersion Scoring

Neural Information Processing SystemsMar-13-2024, 05:32:10 GMT

In this paper, we address the question of identifiability and learning algorithms for large-scale Poisson Directed Acyclic Graphical (DAG) models. We define general Poisson DAG models as models where each node is a Poisson random variable with rate parameter depending on the values of the parents in the underlying DAG. First, we prove that Poisson DAG models are identifiable from observational data, and present a polynomial-time algorithm that learns the Poisson DAG model under suitable regularity conditions. The main idea behind our algorithm is based on overdispersion, in that variables that are conditionally Poisson are overdispersed relative to variables that are marginally Poisson.

algorithm, causal, dag model, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Wisconsin > Dane County > Madison (0.14)
North America > United States > Florida > Monroe County > Key West (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.95)

Add feedback

The Brain Uses Reliability of Stimulus Information when Making Perceptual Decisions 1

Neural Information Processing SystemsMar-13-2024, 05:30:50 GMT

In simple perceptual decisions the brain has to identify a stimulus based on noisy sensory samples from the stimulus. Basic statistical considerations state that the reliability of the stimulus information, i.e., the amount of noise in the samples, should be taken into account when the decision is made. However, for perceptual decision making experiments it has been questioned whether the brain indeed uses the reliability for making decisions when confronted with unpredictable changes in stimulus reliability. We here show that even the basic drift diffusion model, which has frequently been used to explain experimental findings in perceptual decision making, implicitly relies on estimates of stimulus reliability. We then show that only those variants of the drift diffusion model which allow stimulusspecific reliabilities are consistent with neurophysiological findings. Our analysis suggests that the brain estimates the reliability of the stimulus on a short time scale of at most a few hundred milliseconds.

coherence, reliability, variant, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Germany > Saxony > Dresden (0.04)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Discrete Rényi Classifiers

Neural Information Processing SystemsMar-13-2024, 05:15:50 GMT

When the probability distribution P(X, Y) is known, the optimal classifier, leading to the minimum misclassification rate, is given by the Maximum A-posteriori Probability (MAP) decision rule. However, in practice, estimating the complete joint distribution P(X, Y) is computationally and statistically impossible for large values of d. Therefore, an alternative approach is to first estimate some low order marginals of the joint probability distribution P(X, Y) and then design the classifier based on the estimated low order marginals. This approach is also helpful when the complete training data instances are not available due to privacy concerns. In this work, we consider the problem of finding the optimum classifier based on some estimated low order marginals of (X, Y).

classifier, decision rule, misclassification rate, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.05)
Asia > Middle East > Jordan (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

The Poisson Gamma Belief Network

Neural Information Processing SystemsMar-13-2024, 05:01:27 GMT

To infer a multilayer representation of high-dimensional count vectors, we propose the Poisson gamma belief network (PGBN) that factorizes each of its layers into the product of a connection weight matrix and the nonnegative real hidden units of the next layer. The PGBN's hidden layers are jointly trained with an upward-downward Gibbs sampler, each iteration of which upward samples Dirichlet distributed connection weight vectors starting from the first layer (bottom data layer), and then downward samples gamma distributed hidden units starting from the top hidden layer. The gamma-negative binomial process combined with a layer-wise training strategy allows the PGBN to infer the width of each layer given a fixed budget on the width of the first layer. The PGBN with a single hidden layer reduces to Poisson factor analysis. Example results on text analysis illustrate interesting relationships between the width of the first layer and the inferred network structure, and demonstrate that the PGBN, whose hidden units are imposed with correlated gamma priors, can add more layers to increase its performance gains over Poisson factor analysis, given the same limit on the width of the first layer.

belief network, pgbn, upward-downward gibbs sampler, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Travis County > Austin (0.14)
Asia > Middle East > Jordan (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Variational Consensus Monte Carlo

Neural Information Processing SystemsMar-13-2024, 04:44:33 GMT

Practitioners of Bayesian statistics have long depended on Markov chain Monte Carlo (MCMC) to obtain samples from intractable posterior distributions. Unfortunately, MCMC algorithms are typically serial, and do not scale to the large datasets typical of modern machine learning. The recently proposed consensus Monte Carlo algorithm removes this limitation by partitioning the data and drawing samples conditional on each partition in parallel [22]. A fixed aggregation function then combines these samples, yielding approximate posterior samples. We introduce variational consensus Monte Carlo (VCMC), a variational Bayes algorithm that optimizes over aggregation functions to obtain samples from a distribution that better approximates the target. The resulting objective contains an intractable entropy term; we therefore derive a relaxation of the objective and show that the relaxed problem is blockwise concave under mild conditions. We illustrate the advantages of our algorithm on three inference tasks from the literature, demonstrating both the superior quality of the posterior approximation and the moderate overhead of the optimization step. Our algorithm achieves a relative error reduction (measured against serial MCMC) of up to 39% compared to consensus Monte Carlo on the task of estimating 300-dimensional probit regression parameter expectations; similarly, it achieves an error reduction of 92% on the task of estimating cluster comembership probabilities in a Gaussian mixture model with 8 components in 8 dimensions. Furthermore, these gains come at moderate cost compared to the runtime of serial MCMC--achieving near-ideal speedup in some instances.

artificial intelligence, bayesian inference, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
Asia > Middle East > Jordan (0.04)
South America > Paraguay > Asunción > Asunción (0.04)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Add feedback

Rate-Agnostic (Causal) Structure Learning David Danks The Mind Research Network, Carnegie-Mellon University Albuquerque, NM

Neural Information Processing SystemsMar-13-2024, 04:30:20 GMT

Causal structure learning from time series data is a major scientific challenge. Extant algorithms assume that measurements occur sufficiently quickly; more precisely, they assume approximately equal system and measurement timescales. In many domains, however, measurements occur at a significantly slower rate than the underlying system changes, but the size of the timescale mismatch is often unknown. This paper develops three causal structure learning algorithms, each of which discovers all dynamic causal graphs that explain the observed measurement data, perhaps given undersampling. That is, these algorithms all learn causal structure in a "rate-agnostic" manner: they do not assume any particular relation between the measurement and system timescales. We apply these algorithms to data from simulations to gain insight into the challenge of undersampling.

algorithm, causal structure, graph, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > New Mexico > Bernalillo County > Albuquerque (0.41)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > Oregon > Benton County > Corvallis (0.04)
(2 more...)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

Add feedback

Gradient Estimation Using Stochastic Computation Graphs

Neural Information Processing SystemsMar-13-2024, 04:15:37 GMT

In a variety of problems originating in supervised, unsupervised, and reinforcement learning, the loss function is defined by an expectation over a collection of random variables, which might be part of a probabilistic model or the external world. Estimating the gradient of this loss function, using samples, lies at the core of gradient-based learning algorithms for these problems. We introduce the formalism of stochastic computation graphs--directed acyclic graphs that include both deterministic functions and conditional probability distributions--and describe how to easily and automatically derive an unbiased estimator of the loss function's gradient. The resulting algorithm for computing the gradient estimator is a simple modification of the standard backpropagation algorithm. The generic scheme we propose unifies estimators derived in variety of prior work, along with variance-reduction techniques therein. It could assist researchers in developing intricate models involving a combination of stochastic and deterministic operations, enabling, for example, attention, memory, and control actions.

estimator, gradient estimator, stochastic computation graph, (11 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > New York (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)

Add feedback

Bayesian Active Model Selection with an Application to Automated Audiometry

Neural Information Processing SystemsMar-13-2024, 04:01:02 GMT

We introduce a novel information-theoretic approach for active model selection and demonstrate its effectiveness in a real-world application.

approximation, model selection, selection, (17 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Missouri > St. Louis County > St. Louis (0.05)
North America > United States > New York > Tompkins County > Ithaca (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.46)

Industry: Health & Medicine > Therapeutic Area > Otolaryngology (0.31)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.65)

Add feedback

Max-Margin Majority Voting for Learning from Crowds

Neural Information Processing SystemsMar-13-2024, 03:59:50 GMT

Learning-from-crowds aims to design proper aggregation strategies to infer the unknown true labels from the noisy labels provided by ordinary web workers.

constraint, estimator, majority voting, (14 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Add feedback