AITopics | Bayesian Inference

Bayesian Inference

Bayes' theorem lets a program infer the probabilities of likely causes from their observed effects when what it is given are the probabilities of the effects given each cause (together with prior probabilities for the causes).
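As a minimal sketch of the statement above, the following Python function turns P(effect | cause) and a prior P(cause) into P(cause | effect). The function name, the causes, and all numbers are hypothetical and chosen only for illustration.

```python
def posterior_over_causes(prior, likelihood, effect):
    """Return P(cause | effect) for every cause via Bayes' theorem.

    prior:      dict mapping cause -> P(cause)
    likelihood: dict mapping cause -> dict mapping effect -> P(effect | cause)
    effect:     the effect that was observed
    """
    # Unnormalised posterior: P(effect | cause) * P(cause)
    joint = {c: likelihood[c][effect] * prior[c] for c in prior}
    evidence = sum(joint.values())  # P(effect), summed over all causes
    if evidence == 0:
        raise ValueError("observed effect has zero probability under the model")
    return {c: p / evidence for c, p in joint.items()}


# Hypothetical numbers, for illustration only.
prior = {"flu": 0.1, "cold": 0.9}
likelihood = {
    "flu": {"fever": 0.8, "no_fever": 0.2},
    "cold": {"fever": 0.2, "no_fever": 0.8},
}
posterior = posterior_over_causes(prior, likelihood, "fever")
```

With these numbers the posterior probability of "flu" given "fever" is 0.08 / 0.26, roughly 0.31, even though the prior was only 0.1.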

TRAMP: Compositional Inference with TRee Approximate Message Passing

Baker, Antoine, Aubin, Benjamin, Krzakala, Florent, Zdeborová, Lenka

arXiv.org Machine Learning, Apr-3-2020

We introduce tramp, standing for TRee Approximate Message Passing, a Python package for compositional inference in high-dimensional tree-structured models. The package provides a unifying framework to study several approximate message passing algorithms previously derived for a variety of machine learning tasks such as generalized linear models, inference in multi-layer networks, matrix factorization, and reconstruction using non-separable penalties. For some models, the asymptotic performance of the algorithm can be theoretically predicted by the state evolution, and the measurements entropy estimated by the free entropy formalism. The implementation is modular by design: each module, which implements a factor, can be composed at will with other modules to solve complex inference tasks. The user only needs to declare the factor graph of the model: the inference algorithm, state evolution and entropy estimation are fully automated.

algorithm, state evolution, variance, (14 more...)

arXiv:2004.01571

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.92)

Neural Conditional Event Time Models

Engelhard, Matthew, Berchuck, Samuel, D'Arcy, Joshua, Henao, Ricardo

arXiv.org Machine Learning, Apr-3-2020

Event time models predict occurrence times of an event of interest based on known features. Recent work has demonstrated that neural networks achieve state-of-the-art event time predictions in a variety of settings. However, standard event time models suppose that the event occurs, eventually, in all cases. Consequently, no distinction is made between a) the probability of event occurrence, and b) the predicted time of occurrence. This distinction is critical when predicting medical diagnoses, equipment defects, social media posts, and other events that may or may not occur, and for which the features affecting a) may be different from those affecting b). In this work, we develop a conditional event time model that distinguishes between these components, implement it as a neural network with a binary stochastic layer representing finite event occurrence, and show how it may be learned from right-censored event times via maximum likelihood estimation. Results demonstrate superior event occurrence and event time predictions on synthetic data, medical events (MIMIC-III), and social media posts (Reddit), comprising 21 total prediction tasks.
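The separation between a) and b) can be illustrated with a small likelihood sketch. It is not the paper's neural model: it assumes a single scalar occurrence probability and an exponential time-to-event distribution, and the names `log_likelihood`, `p_occur`, and `rate` are hypothetical. A right-censored observation contributes the probability that the event either never occurs or occurs after the censoring time.

```python
import math


def log_likelihood(t, observed, p_occur, rate):
    """Log-likelihood of one (possibly right-censored) event time.

    t:        event time if observed, otherwise the censoring time
    observed: True if the event was seen at time t
    p_occur:  probability that the event ever occurs, in (0, 1]
    rate:     rate of the exponential time-to-event distribution (assumed)
    """
    if observed:
        # The event occurs (probability p_occur) and its time has density
        # rate * exp(-rate * t).
        return math.log(p_occur) + math.log(rate) - rate * t
    # Censored: the event never occurs, or it occurs later than t.
    survival = math.exp(-rate * t)
    return math.log((1.0 - p_occur) + p_occur * survival)


# Total log-likelihood of a tiny hypothetical data set: (time, observed) pairs.
data = [(1.2, True), (5.0, False), (0.4, True)]
total = sum(log_likelihood(t, obs, p_occur=0.6, rate=0.5) for t, obs in data)
```

Setting `p_occur` to 1 recovers the standard assumption that every event eventually occurs, in which case a censored observation contributes only the survival term.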

event time, prediction, time model, (14 more...)

arXiv:2004.01376

Country: Asia > Middle East > Israel (0.04)

Genre:

Research Report > New Finding (0.88)
Research Report > Experimental Study (0.68)

Industry:

Health & Medicine > Therapeutic Area (0.68)
Health & Medicine > Diagnostic Medicine (0.66)
Health & Medicine > Health Care Providers & Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Sum-product networks: A survey

París, Iago, Sánchez-Cauce, Raquel, Díez, Francisco Javier

arXiv.org Artificial Intelligence, Apr-2-2020

A sum-product network (SPN) is a probabilistic model, based on a rooted acyclic directed graph, in which terminal nodes represent univariate probability distributions and non-terminal nodes represent convex combinations (weighted sums) and products of probability functions. SPNs are closely related to probabilistic graphical models, in particular to Bayesian networks with multiple context-specific independencies. Their main advantage is the possibility of building tractable models from data, i.e., models that can perform several inference tasks in time proportional to the number of links in the graph. They are somewhat similar to neural networks and can address the same kinds of problems, such as image processing and natural language understanding. This paper offers a survey of SPNs, including their definition, the main algorithms for inference and learning from data, the main applications, a brief review of software libraries, and a comparison with related models.
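The definition above can be made concrete with a toy SPN over two binary variables. This is an illustrative sketch, not code from the survey or from any SPN library; the class names, structure, and weights are hypothetical. Leaving a variable out of the evidence marginalises it, and each query is a single bottom-up pass over the graph.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Leaf:
    """Terminal node: a Bernoulli distribution over one binary variable."""
    var: str
    p_true: float

    def value(self, evidence):
        x = evidence.get(self.var)
        if x is None:
            return 1.0  # variable not observed: marginalise it out
        return self.p_true if x else 1.0 - self.p_true


@dataclass
class Product:
    """Non-terminal node: product of children over disjoint variables."""
    children: List

    def value(self, evidence):
        result = 1.0
        for child in self.children:
            result *= child.value(evidence)
        return result


@dataclass
class Sum:
    """Non-terminal node: convex combination of its children."""
    weights: List[float]
    children: List

    def value(self, evidence):
        return sum(w * c.value(evidence)
                   for w, c in zip(self.weights, self.children))


# Hypothetical structure and parameters.
spn = Sum(
    weights=[0.6, 0.4],
    children=[
        Product([Leaf("x1", 0.9), Leaf("x2", 0.2)]),
        Product([Leaf("x1", 0.1), Leaf("x2", 0.7)]),
    ],
)
joint = spn.value({"x1": True, "x2": False})  # P(x1=True, x2=False)
marginal = spn.value({"x1": True})            # P(x1=True)
```

Here the joint query evaluates to 0.6 * 0.9 * 0.8 + 0.4 * 0.1 * 0.3 = 0.444 and the marginal to 0.58, up to floating-point rounding.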

algorithm, node, spn, (15 more...)

arXiv:2004.01167

Country:

Europe > Spain > Galicia > Madrid (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry: Government (0.45)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(4 more...)

Deep transformation models: Tackling complex regression problems with neural network based transformation models

Sick, Beate, Hothorn, Torsten, Dürr, Oliver

arXiv.org Machine Learning, Apr-1-2020

We present a deep transformation model for probabilistic regression. Deep learning is known for outstandingly accurate predictions on complex data, but in regression tasks it is predominantly used to predict just a single number. This ignores the non-deterministic character of most tasks. Especially if crucial decisions are based on the predictions, as in medical applications, it is essential to quantify the prediction uncertainty. The presented deep learning transformation model estimates the whole conditional probability distribution, which is the most thorough way to capture uncertainty about the outcome. We combine ideas from a statistical transformation model (most likely transformation) with recent transformation models from deep learning (normalizing flows) to predict complex outcome distributions. The core of the method is a parameterized transformation function which can be trained with the usual maximum likelihood framework using gradient descent. The method can be combined with existing deep learning architectures. For small machine learning benchmark datasets, we report state-of-the-art performance on most datasets and in some cases even exceed it. Our method works for complex input data, which we demonstrate by employing a CNN architecture on image data.
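Both transformation models and normalizing flows rest on the change-of-variables formula: if an increasing transformation h maps the outcome y to a standard normal variable, then log p(y) = log φ(h(y)) + log h'(y). The sketch below evaluates that log-density for a fixed, hand-written transformation. The function name and the choice of h are assumptions made for illustration; the paper's parameterized transformation is not reproduced here.

```python
import math


def log_density(y, transform, transform_derivative):
    """log p_Y(y) when transform(Y) is standard normal and transform is increasing."""
    z = transform(y)
    slope = transform_derivative(y)
    if slope <= 0:
        raise ValueError("transform must be strictly increasing at y")
    # Log-density of the standard normal base distribution at z.
    log_base = -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)
    # Change of variables: add the log of the Jacobian dz/dy.
    return log_base + math.log(slope)


# Illustrative choice: h(y) = log(y), which yields a standard log-normal outcome.
value = log_density(1.0, math.log, lambda y: 1.0 / y)
```

Maximum likelihood training would sum this quantity over the training outcomes and adjust the parameters of the transformation by gradient descent.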

artificial intelligence, machine learning, transformation model, (19 more...)

arXiv:2004.00464

Country: Europe > Switzerland > Zürich > Zürich (0.05)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)

Bayesian ODE Solvers: The Maximum A Posteriori Estimate

Tronarp, Filip, Särkkä, Simo, Hennig, Philipp

arXiv.org Machine Learning, Apr-1-2020

It has recently been established that the numerical solution of ordinary differential equations can be posed as a nonlinear Bayesian inference problem, which can be approximately solved via Gaussian filtering and smoothing, whenever a Gauss-Markov prior is used. In this paper the class of $\nu$-times differentiable linear time-invariant Gauss-Markov priors is considered. A taxonomy of Gaussian estimators is established, with the maximum a posteriori estimate at the top of the hierarchy, which can be computed with the iterated extended Kalman smoother. The remaining three classes are termed explicit, semi-implicit, and implicit, by analogy with the classical notions; they correspond to conditions on the vector field under which the filter update produces a local maximum a posteriori estimate. The maximum a posteriori estimate corresponds to an optimal interpolant in the reproducing kernel Hilbert space associated with the prior, which in the present case is equivalent to a Sobolev space of smoothness $\nu+1$. Consequently, using methods from scattered data approximation and nonlinear analysis in Sobolev spaces, it is shown that the maximum a posteriori estimate converges to the true solution at a polynomial rate in the fill-distance (maximum step size) subject to mild conditions on the vector field. The methodology developed provides a novel and more natural approach to study the convergence of these estimators than classical methods of convergence analysis. The methods and theoretical results are demonstrated in numerical examples.

differential equation, solver, vector field, (14 more...)

arXiv:2004.00623

Country:

Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
(9 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

SUMO: Unbiased Estimation of Log Marginal Probability for Latent Variable Models

Luo, Yucen, Beatson, Alex, Norouzi, Mohammad, Zhu, Jun, Duvenaud, David, Adams, Ryan P., Chen, Ricky T. Q.

arXiv.org Machine Learning, Apr-1-2020

Standard variational lower bounds used to train latent variable models produce biased estimates of most quantities of interest. We introduce an unbiased estimator of the log marginal likelihood and its gradients for latent variable models based on randomized truncation of infinite series. If parameterized by an encoder-decoder architecture, the parameters of the encoder can be optimized to minimize the variance of this estimator. We show that models trained using our estimator give better test-set likelihoods than a standard importance-sampling based approach for the same average computational cost. This estimator also allows use of latent variable models for tasks where unbiased estimators, rather than marginal likelihood lower bounds, are preferred, such as minimizing reverse KL divergences and estimating score functions.
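Randomized truncation of an infinite series can be sketched on a toy problem. The code below is not the SUMO estimator itself, whose series the abstract does not spell out. It only shows the general idea, assuming a geometric stopping rule: stop at a random index K and divide each retained term by the probability P(K >= k) of reaching it, so that the expectation equals the full sum. The function name and the example series are hypothetical.

```python
import random


def russian_roulette_estimate(term, continue_prob, rng):
    """One unbiased sample of sum_{k>=0} term(k) using randomized truncation.

    term:          function k -> k-th term of the series
    continue_prob: probability of continuing after each term, in (0, 1)
    rng:           a random.Random instance
    """
    total = 0.0
    k = 0
    reach_prob = 1.0  # P(K >= k): probability that term k is included
    while True:
        total += term(k) / reach_prob
        if rng.random() >= continue_prob:
            return total
        k += 1
        reach_prob *= continue_prob


# Toy series: sum_{k>=0} 0.5**k = 2.
rng = random.Random(0)
samples = [russian_roulette_estimate(lambda k: 0.5 ** k, 0.7, rng)
           for _ in range(20000)]
mean_estimate = sum(samples) / len(samples)
```

Each sample uses a finite, random number of terms, yet the average converges to the value of the infinite sum. The variance depends on how fast the terms decay relative to `continue_prob`.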

estimator, international conference, sumo, (15 more...)

arXiv:2004.00353

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > California > Los Angeles County > Santa Monica (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Mining International Political Norms from the GDELT Database

Murali, Rohit, Patnaik, Suravi, Cranefield, Stephen

arXiv.org Artificial Intelligence, Mar-31-2020

Researchers have long been interested in the role that norms can play in governing agent actions in multi-agent systems. Much work has been done on formalising normative concepts from human society and adapting them for the government of open software systems, and on the simulation of normative processes in human and artificial societies. However, there has been comparatively little work on applying normative MAS mechanisms to understanding the norms in human society. This work investigates this issue in the context of international politics. Using the GDELT dataset, containing machine-encoded records of international events extracted from news reports, we extracted bilateral sequences of inter-country events and applied a Bayesian norm mining mechanism to identify norms that best explained the observed behaviour. A statistical evaluation showed that the normative model fitted the data significantly better than a probabilistic discrete event model.

hypothesis, probability, sequence, (14 more...)

arXiv:2003.14027

Country:

Oceania > New Zealand > South Island > Otago > Dunedin (0.04)
North America > United States > Pennsylvania (0.04)
Europe > Albania (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Government > Foreign Policy (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Exact marginal inference in Latent Dirichlet Allocation

Maennel, Hartmut

arXiv.org Machine Learning, Mar-31-2020

Assume we have potential "causes" $z\in Z$, which produce "events" $w$ with known probabilities $\beta(w|z)$. We observe $w_1, w_2, \ldots, w_n$; what can we say about the distribution of the causes? A Bayesian estimate will assume a prior on distributions on $Z$ (we assume a Dirichlet prior) and calculate a posterior. An average over that posterior then gives a distribution on $Z$, which estimates how much each cause $z$ contributed to our observations. This is the setting of Latent Dirichlet Allocation, which can be applied, e.g., to topics "producing" words in a document. In this setting the number of observed words is usually large, but the number of potential topics is small. We are here interested in applications with many potential "causes" (e.g. locations on the globe), but only a few observations. We show that the exact Bayesian estimate can be computed in linear time (and constant space) in $|Z|$ for a given upper bound on $n$ with a surprisingly simple formula. We generalize this algorithm to the case of sparse probabilities $\beta(w|z)$, in which we only need to assume that the tree width of an "interaction graph" on the observations is limited. On the other hand, we also show that without such a limitation the problem is NP-hard.
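For very small $n$ the posterior average described above can be checked by brute force. The sketch below is explicitly not the linear-time formula from the paper: it enumerates all $|Z|^n$ assignments of causes to observations and is therefore exponential in $n$. It weights each assignment by the likelihood and the Dirichlet-multinomial prior, then averages the resulting Dirichlet posterior means. The function and argument names are hypothetical.

```python
import itertools
import math


def posterior_mean_causes(alpha, beta, observations):
    """Brute-force posterior mean of the cause distribution.

    alpha:        Dirichlet prior parameters, one per cause z
    beta:         beta[z][w] = P(w | z)
    observations: list of observed event indices w_1..w_n
    """
    num_causes = len(alpha)
    n = len(observations)
    alpha_sum = sum(alpha)
    mean = [0.0] * num_causes
    normaliser = 0.0
    # Enumerate every assignment of a cause to each observation.
    for assignment in itertools.product(range(num_causes), repeat=n):
        counts = [0] * num_causes
        weight = 1.0
        for z, w in zip(assignment, observations):
            counts[z] += 1
            weight *= beta[z][w]
        # Dirichlet-multinomial prior of the assignment, up to a constant factor.
        weight *= math.exp(sum(math.lgamma(a + c) - math.lgamma(a)
                               for a, c in zip(alpha, counts)))
        normaliser += weight
        # Posterior mean of the Dirichlet given this assignment's counts.
        for z in range(num_causes):
            mean[z] += weight * (alpha[z] + counts[z]) / (alpha_sum + n)
    if normaliser == 0:
        raise ValueError("observations have zero probability under the model")
    return [m / normaliser for m in mean]


# Hypothetical example: two causes, two event types, one observation of event 0.
estimate = posterior_mean_causes([1.0, 1.0], [[0.9, 0.1], [0.1, 0.9]], [0])
```

In this example the estimate for the first cause is 1.9 / 3, roughly 0.63, which moves the uniform prior mean of 0.5 toward the cause that explains the observation better.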

algorithm, polynomial, power series, (14 more...)

arXiv:2004.00115

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.61)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.61)

Learning from Small Data Through Sampling an Implicit Conditional Generative Latent Optimization Model

Azuri, Idan, Weinshall, Daphna

arXiv.org Machine Learning, Mar-31-2020

We revisit the long-standing problem of learning from a small sample. In recent years major efforts have been invested in the generation of new samples from a small set of training data points. Some use classical transformations, others synthesize new examples. Our approach belongs to the second category. We propose a new model based on conditional Generative Latent Optimization (cGLO). Our model learns to synthesize completely new samples for every class just by interpolating between samples in the latent space. The proposed method samples the learned latent space using spherical interpolations (slerp) and generates a new sample using the trained generator. Our empirical results show that the new sampled set is diverse enough, leading to improvement in image classification in comparison to the state of the art, when trained on small samples of CIFAR-100 and CUB-200.
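Spherical interpolation between two latent vectors follows the standard slerp formula, which moves along the great-circle arc between them rather than along the straight chord. The sketch below uses plain Python lists. It illustrates the formula only and makes no claim about how the paper's implementation handles latent codes.

```python
import math


def slerp(p0, p1, t):
    """Spherical linear interpolation between vectors p0 and p1, for t in [0, 1]."""
    dot = sum(a * b for a, b in zip(p0, p1))
    norm0 = math.sqrt(sum(a * a for a in p0))
    norm1 = math.sqrt(sum(b * b for b in p1))
    if norm0 == 0 or norm1 == 0:
        raise ValueError("slerp is undefined for zero vectors")
    # Angle between the two vectors, clamped against rounding error.
    cos_omega = max(-1.0, min(1.0, dot / (norm0 * norm1)))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if sin_omega < 1e-8:
        # Nearly parallel (or antiparallel) vectors: fall back to linear interpolation.
        return [(1.0 - t) * a + t * b for a, b in zip(p0, p1)]
    w0 = math.sin((1.0 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(p0, p1)]


# Midpoint between two orthogonal unit vectors stays on the unit circle.
midpoint = slerp([1.0, 0.0], [0.0, 1.0], 0.5)
```

For unit-norm inputs the interpolated point also has unit norm, which is why slerp is often preferred over linear interpolation when latent codes are constrained to a sphere.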

augmentation, classification, latent space, (15 more...)

arXiv:2003.14297

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada > Ontario > Toronto (0.04)
(3 more...)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Flows for simultaneous manifold learning and density estimation

Brehmer, Johann, Cranmer, Kyle

arXiv.org Machine Learning, Mar-30-2020

We introduce manifold-modeling flows (MFMFs), a new class of generative models that simultaneously learn the data manifold as well as a tractable probability density on that manifold. Combining aspects of normalizing flows, GANs, autoencoders, and energy-based models, they have the potential to represent data sets with a manifold structure more faithfully and provide handles on dimensionality reduction, denoising, and out-of-distribution detection. We argue why such models should not be trained by maximum likelihood alone and present a new training algorithm that separates manifold and density updates. With two pedagogical examples we demonstrate how manifold-modeling flows let us learn the data manifold and allow for better inference than standard flows in the ambient data space.

likelihood, manifold, (16 more...)

arXiv:2003.13913

Country:

North America > United States > New York (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report (1.00)

Industry: Education (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)
