AITopics | Bayesian Inference

Bayesian Inference

Bayes' Theorem allows a program to infer the probabilities of likely causes from observed effects, when what it is given are the probabilities of those effects given the causes.
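
As a rough illustration of the statement above, the following Python sketch turns the probabilities of an effect given each cause into the probabilities of each cause given the effect. The function name `posterior` and all the numbers (a made-up alarm on faulty and healthy machines) are assumptions chosen for illustration only; they do not come from this page.

```python
def posterior(prior, likelihood):
    """Return P(cause | effect) for every cause.

    prior:      dict mapping cause -> P(cause)
    likelihood: dict mapping cause -> P(effect | cause)
    """
    # Unnormalized posterior: P(effect | cause) * P(cause)
    joint = {cause: likelihood[cause] * prior[cause] for cause in prior}
    # P(effect), obtained by summing over all causes
    evidence = sum(joint.values())
    return {cause: value / evidence for cause, value in joint.items()}


# Hypothetical numbers: 1% of machines are faulty; an alarm fires on 90% of
# faulty machines and on 5% of healthy ones.
prior = {"faulty": 0.01, "healthy": 0.99}
likelihood = {"faulty": 0.90, "healthy": 0.05}

post = posterior(prior, likelihood)
print(round(post["faulty"], 3))  # 0.154
```

Even though the alarm is far more likely on a faulty machine, the low prior keeps the probability of a fault given the alarm at roughly 15% in this made-up setting.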

Copula Bayesian Networks

Elidan, Gal

Neural Information Processing Systems, Feb-15-2020, 00:56:39 GMT

We present the Copula Bayesian Network model for representing multivariate continuous distributions. Our approach builds on a novel copula-based parameterization of a conditional density that, joined with a graph that encodes independencies, offers great flexibility in modeling high-dimensional densities, while maintaining control over the form of the univariate marginals. We demonstrate the advantage of our framework for generalization over standard Bayesian networks as well as tree structured copula models for varied real-life domains that are of substantially higher dimension than those typically considered in the copula literature. Papers published at the Neural Information Processing Systems Conference.

copula bayesian network

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

A Bayesian Approach to Concept Drift

Bach, Stephen, Maloof, Mark

Neural Information Processing Systems, Feb-15-2020, 00:41:31 GMT

To cope with concept drift, we placed a probability distribution over the location of the most-recent drift point. We used Bayesian model comparison to update this distribution from the predictions of models trained on blocks of consecutive observations and pruned potential drift points with low probability. We compare our approach to a non-probabilistic method for drift and a probabilistic method for change-point detection. In our experiments, our approach generally yielded improved accuracy and/or speed over these other methods.
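
The idea of a probability distribution over the drift location can be sketched on a toy binary stream. This is not the authors' algorithm, which compares predictions of models trained on blocks of observations; it is a simplified single-drift-point version that swaps in a Beta-Bernoulli model for each segment. The function names, the uniform prior over locations, and the 0.01 pruning threshold are all assumptions made for illustration.

```python
import math


def log_marginal(bits, a=1.0, b=1.0):
    """Log marginal likelihood of a binary segment under a Beta(a, b) prior."""
    ones = sum(bits)
    zeros = len(bits) - ones

    def log_beta(x, y):
        return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)

    return log_beta(a + ones, b + zeros) - log_beta(a, b)


def drift_posterior(bits):
    """Posterior over the drift location t, assuming a uniform prior over t.

    t = 0 means "no drift"; otherwise bits[:t] and bits[t:] are treated as
    two independent Beta-Bernoulli segments.
    """
    log_scores = [log_marginal(bits[:t]) + log_marginal(bits[t:])
                  for t in range(len(bits))]
    top = max(log_scores)  # subtract the maximum for numerical stability
    weights = [math.exp(score - top) for score in log_scores]
    total = sum(weights)
    return [w / total for w in weights]


# Toy stream: ten 0s followed by ten 1s, so the drift happens at index 10.
stream = [0] * 10 + [1] * 10
probs = drift_posterior(stream)
print(probs.index(max(probs)))  # 10

# Prune candidate drift points whose posterior probability is very low.
kept = [t for t, p in enumerate(probs) if p >= 0.01]
```

Pruning keeps the set of candidate locations small, which loosely mirrors the pruning of low-probability drift points described in the abstract.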

bayesian approach, concept drift, drift point

Neural Information Processing Systems

Genre: Research Report (0.73)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.79)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.79)

Fast Bayesian Inference for Non-Conjugate Gaussian Process Regression

Khan, Emtiyaz, Mohamed, Shakir, Murphy, Kevin P.

Neural Information Processing Systems, Feb-15-2020, 00:27:29 GMT

We present a new variational inference algorithm for Gaussian processes with non-conjugate likelihood functions. This includes binary and multi-class classification, as well as ordinal regression. Our method constructs a convex lower bound, which can be optimized by using an efficient fixed point update method. We then show empirically that our new approach is much faster than existing methods without any degradation in performance.

fast bayesian inference, non-conjugate gaussian process regression

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.40)

Automated Refinement of Bayes Networks' Parameters based on Test Ordering Constraints

Khan, Omar Z., Poupart, Pascal, Agosta, John-Mark M.

Neural Information Processing Systems, Feb-15-2020, 00:11:09 GMT

In this paper, we derive a method to refine a Bayes network diagnostic model by exploiting constraints implied by expert decisions on test ordering. At each step, the expert executes an evidence gathering test, which suggests the test's relative diagnostic value. We demonstrate that consistency with an expert's test selection leads to non-convex constraints on the model parameters. We incorporate these constraints by augmenting the network with nodes that represent the constraint likelihoods. Gibbs sampling, stochastic hill climbing and greedy search algorithms are proposed to find a MAP estimate that takes into account test ordering constraints and any data available.

artificial intelligence, bayesian inference, machine learning, (3 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.68)

Optimization-Based MCMC Methods for Nonlinear Hierarchical Statistical Inverse Problems

Bardsley, Johnathan, Cui, Tiangang

arXiv.org Machine Learning, Feb-15-2020

In many hierarchical inverse problems, not only do we want to estimate high- or infinite-dimensional model parameters in the parameter-to-observable maps, but we also have to estimate hyperparameters that represent critical assumptions in the statistical and mathematical modeling processes. As a joint effect of high-dimensionality, nonlinear dependence, and non-concave structures in the joint posterior distribution over model parameters and hyperparameters, solving inverse problems in the hierarchical Bayesian setting poses a significant computational challenge. In this work, we aim to develop scalable optimization-based Markov chain Monte Carlo (MCMC) methods for solving hierarchical Bayesian inverse problems with nonlinear parameter-to-observable maps and a broader class of hyperparameters. Our algorithmic development is based on the recently developed scalable randomize-then-optimize (RTO) method [4] for exploring the high- or infinite-dimensional model parameter space. By using RTO either as a proposal distribution in a Metropolis-within-Gibbs update or as a biasing distribution in the pseudo-marginal MCMC [2], we are able to design efficient sampling tools for hierarchical Bayesian inversion. In particular, the integration of RTO and the pseudo-marginal MCMC has sampling performance robust to model parameter dimensions. We also extend our methods to nonlinear inverse problems with Poisson-distributed measurements. Numerical examples in PDE-constrained inverse problems and positron emission tomography (PET) are used to demonstrate the performance of our methods.

bayesian inference, inverse problem, upstream oil & gas, (20 more...)

arXiv.org Machine Learning

2002.06358

Country: North America > United States > Montana > Missoula County > Missoula (0.14)

Genre: Research Report (0.64)

Industry: Energy > Oil & Gas > Upstream (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)

Posterior Ratio Estimation for Latent Variables

Zhang, Yulong, Yi, Mingxuan, Liu, Song, Kolar, Mladen

arXiv.org Machine Learning, Feb-15-2020

Comparing the underlying distributions of two given datasets has been an important task in the machine learning community and has a wide range of applications. For example, change detection algorithms (Kawahara and Sugiyama, 2012) compare datasets collected at different time points and report how the underlying distribution has shifted over time; transfer learning algorithms (Quiñonero-Candela et al., 2009) utilize the estimated differences between two datasets to efficiently share information between different tasks. The Generative Adversarial Net (GAN) (Goodfellow et al., 2014) learns an implicit generative model whose output minimizes the differences between an artificial dataset and a real dataset. Various computational methods have been proposed for comparing underlying distributions given two sets of observations. For example, Maximum Mean Discrepancy (MMD) (Gretton et al., 2012) computes the distance between the kernel mean embeddings of two datasets in a Reproducing Kernel Hilbert Space (RKHS).
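
To make the MMD description above concrete, here is a minimal pure-Python sketch of the biased (V-statistic) estimate of squared MMD between two one-dimensional samples. The Gaussian kernel, the bandwidth of 1.0, the function names, and the sample values are assumptions chosen for illustration; they are not taken from the paper.

```python
import math


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel between two scalars."""
    return math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of squared MMD between two 1-D samples.

    Equals the squared RKHS distance between the empirical kernel mean
    embeddings: mean k(x, x') + mean k(y, y') - 2 * mean k(x, y).
    """
    def mean_kernel(us, vs):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in us for v in vs)
        return total / (len(us) * len(vs))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Made-up samples: ys is xs shifted by 3, so the two sets are far apart.
xs = [0.0, 0.1, -0.2, 0.3]
ys = [x + 3.0 for x in xs]

same = mmd_squared(xs, xs)     # identical samples give (numerically) zero
shifted = mmd_squared(xs, ys)  # clearly positive for well-separated samples
print(same, shifted)
```

The double loop costs time proportional to the product of the sample sizes, which is fine for a sketch but would need a vectorized or approximate implementation for large datasets.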

dataset, likelihood function, probability, (12 more...)

arXiv.org Machine Learning

2002.0641

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > United Kingdom > England > Bristol (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Bayesian models for Large-scale Hierarchical Classification

Gopal, Siddharth, Yang, Yiming, Bai, Bing, Niculescu-Mizil, Alexandru

Neural Information Processing Systems, Feb-14-2020, 23:56:40 GMT

A challenging problem in hierarchical classification is to leverage the hierarchical relations among classes for improving classification performance. An even greater challenge is to do so in a manner that is computationally feasible for the large scale problems usually encountered in practice. This paper proposes a set of Bayesian methods to model hierarchical dependencies among class labels using multivariate logistic regression. Specifically, the parent-child relationships are modeled by placing a hierarchical prior over the children nodes centered around the parameters of their parents, thereby encouraging classes nearby in the hierarchy to share similar model parameters. We present new, efficient variational algorithms for tractable posterior inference in these models, and provide a parallel implementation that can comfortably handle large-scale problems with hundreds of thousands of dimensions and tens of thousands of classes.

bayesian inference, large-scale hierarchical classification, machine learning, (2 more...)

Neural Information Processing Systems

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.75)

Bayesian nonparametric models for bipartite graphs

Caron, Francois

Neural Information Processing Systems, Feb-14-2020, 23:27:15 GMT

We develop a novel Bayesian nonparametric model for random bipartite graphs. The model is based on the theory of completely random measures and is able to handle a potentially infinite number of nodes. We show that the model has appealing properties and in particular it may exhibit a power-law behavior. We derive a posterior characterization, an Indian Buffet-like generative process for network growth, and a simple and efficient Gibbs sampler for posterior simulation. Our model is shown to be well fitted to several real-world social networks.

bayesian nonparametric model, bipartite graph

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Bayesian estimation of discrete entropy with mixtures of stick-breaking priors

Archer, Evan, Park, Il Memming, Pillow, Jonathan W.

Neural Information Processing Systems, Feb-14-2020, 23:26:47 GMT

We consider the problem of estimating Shannon's entropy H in the under-sampled regime, where the number of possible symbols may be unknown or countably infinite. Pitman-Yor processes (a generalization of Dirichlet processes) provide tractable prior distributions over the space of countably infinite discrete distributions, and have found major applications in Bayesian non-parametric statistics and machine learning. Here we show that they also provide natural priors for Bayesian entropy estimation, due to the remarkable fact that the moments of the induced posterior distribution over H can be computed analytically. We derive formulas for the posterior mean (Bayes' least squares estimate) and variance under such priors. Moreover, we show that a fixed Dirichlet or Pitman-Yor process prior implies a narrow prior on H, meaning the prior strongly determines the entropy estimate in the under-sampled regime.

bayesian estimation, discrete entropy, under-sampled regime, (2 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.40)

Bayesian Bias Mitigation for Crowdsourcing

Wauthier, Fabian L., Jordan, Michael I.

Neural Information Processing Systems, Feb-14-2020, 23:14:21 GMT

Biased labelers are a systemic problem in crowdsourcing, and a comprehensive toolbox for handling their responses is still being developed. A typical crowdsourcing application can be divided into three steps: data collection, data curation, and learning. At present these steps are often treated separately. We present Bayesian Bias Mitigation for Crowdsourcing (BBMC), a Bayesian model to unify all three. Most data curation methods account for the effects of labeler bias by modeling all labels as coming from a single latent truth.

bayesian bias mitigation, crowdsourcing, data collection, (3 more...)

Neural Information Processing Systems

Technology:

Information Technology > Communications > Social Media > Crowdsourcing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.43)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.43)
