AITopics | Bayesian Inference

Collaborating Authors

Bayesian Inference

Bayes' Theorem allows a program to infer the probabilities of likely causes from the probabilities of their effects, when what it is given are the probabilities of effects, given the causes.

News Overviews Instructional Materials AI-Alerts Classics

Operator Variational Inference

Ranganath, Rajesh, Tran, Dustin, Altosaar, Jaan, Blei, David

Neural Information Processing SystemsDec-31-2016

Variational inference is an umbrella term for algorithms which cast Bayesian inference as optimization. Classically, variational inference uses the Kullback-Leibler divergence to define the optimization. Though this divergence has been widely used, the resultant posterior approximation can suffer from undesirable statistical properties. To address this, we reexamine variational inference from its roots as an optimization problem. We use operators, or functions of functions, to design variational objectives. As one example, we design a variational objective with a Langevin-Stein operator. We develop a black box algorithm, operator variational inference (OPVI), for optimizing any operator objective. Importantly, operators enable us to make explicit the statistical and computational tradeoffs for variational inference. We can characterize different properties of variational objectives, such as objectives that admit data subsampling---allowing inference to scale to massive data---as well as objectives that admit variational programs---a rich class of posterior approximations that does not require a tractable density. We illustrate the benefits of OPVI on a mixture model and a generative model of images.

artificial intelligence, machine learning, objective, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)

Add feedback

A Unified Approach for Learning the Parameters of Sum-Product Networks

Zhao, Han, Poupart, Pascal, Gordon, Geoffrey J.

Neural Information Processing SystemsDec-31-2016

We present a unified approach for learning the parameters of Sum-Product networks (SPNs). We prove that any complete and decomposable SPN is equivalent to a mixture of trees where each tree corresponds to a product of univariate distributions. Based on the mixture model perspective, we characterize the objective function when learning SPNs based on the maximum likelihood estimation (MLE) principle and show that the optimization problem can be formulated as a signomial program. We construct two parameter learning algorithms for SPNs by using sequential monomial approximations (SMA) and the concave-convex procedure (CCCP), respectively. The two proposed methods naturally admit multiplicative updates, hence effectively avoiding the projection operation. With the help of the unified framework, we also show that, in the case of SPNs, CCCP leads to the same algorithm as Expectation Maximization (EM) despite the fact that they are different in general.

artificial intelligence, machine learning, spn, (18 more...)

Neural Information Processing Systems

Country: Europe (0.28)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)

Add feedback

Scan Order in Gibbs Sampling: Models in Which it Matters and Bounds on How Much

He, Bryan D., Sa, Christopher M. De, Mitliagkas, Ioannis, Ré, Christopher

Neural Information Processing SystemsDec-31-2016

Gibbs sampling is a Markov Chain Monte Carlo sampling technique that iteratively samples variables from their conditional distributions. There are two common scan orders for the variables: random scan and systematic scan. Due to the benefits of locality in hardware, systematic scan is commonly used, even though most statistical guarantees are only for random scan. While it has been conjectured that the mixing times of random scan and systematic scan do not differ by more than a logarithmic factor, we show by counterexample that this is not the case, and we prove that that the mixing times do not differ by more than a polynomial factor under mild conditions. To prove these relative bounds, we introduce a method of augmenting the state space to study systematic scan using conductance.

artificial intelligence, machine learning, systematic scan, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.93)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.50)

Add feedback

A Very Short History Of Artificial Intelligence (AI)

#artificialintelligenceDec-30-2016, 14:45:04 GMT

By using this "Contrivance," "the most ignorant Person at a reasonable Charge, and with a little bodily Labour, may write Books in Philosophy, Poetry, Politicks, Law, Mathematicks, and Theology, with the least Assistance from Genius or study." Bayesian inference will become a leading approach in machine learning. The boat was equipped with, as Tesla described, "a borrowed mind." The word "robot" comes from the word "robota" (work). It features a robot double of a peasant girl, Maria, which unleashes chaos in Berlin of 2026--it was the first robot depicted on film, inspiring the Art Deco look of C-3PO in Star Wars.

artificial intelligence, bayesian inference, machine learning, (8 more...)

#artificialintelligence

Country:

North America > United States > New York (0.06)
Asia > Japan (0.06)

Industry:

Leisure & Entertainment > Games (0.37)
Media > Film (0.37)

Technology:

Information Technology > Artificial Intelligence > History (1.00)
Information Technology > Artificial Intelligence > Robots (0.85)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.37)
(2 more...)

Add feedback

Bayesian Learning of Dynamic Multilayer Networks

Durante, Daniele, Mukherjee, Nabanita, Steorts, Rebecca C.

arXiv.org Machine LearningDec-30-2016

A plethora of networks is being collected in a growing number of fields, including disease transmission, international relations, social interactions, and others. As data streams continue to grow, the complexity associated with these highly multidimensional connectivity data presents novel challenges. In this paper, we focus on the time-varying interconnections among a set of actors in multiple contexts, called layers. Current literature lacks flexible statistical models for dynamic multilayer networks, which can enhance quality in inference and prediction by efficiently borrowing information within each network, across time, and between layers. Motivated by this gap, we develop a Bayesian nonparametric model leveraging latent space representations. Our formulation characterizes the edge probabilities as a function of shared and layer-specific actors positions in a latent space, with these positions changing in time via Gaussian processes. This representation facilitates dimensionality reduction and incorporates different sources of information in the observed data. In addition, we obtain tractable procedures for posterior computation, inference, and prediction. We provide theoretical results on the flexibility of our model. Our methods are tested on simulations and infection studies monitoring dynamic face-to-face contacts among individuals in multiple days, where we perform better than current methods in inference and prediction.

information, prediction, probability, (15 more...)

arXiv.org Machine Learning

1608.02209

Country:

Africa > Kenya (0.04)
North America > United States > North Carolina > Durham County > Durham (0.04)
Europe > Italy (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Epidemiology (0.93)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.85)

Add feedback

High-dimensional Filtering using Nested Sequential Monte Carlo

Naesseth, Christian A., Lindsten, Fredrik, Schön, Thomas B.

arXiv.org Machine LearningDec-29-2016

Inference in complex and high-dimensional statistical models is a very challenging problem that is ubiquitous in applications such as climate informatics [Monteleoni et al., 2013], bioinformatics [Cohen, 2004] and machine learning [Wainwright and Jordan, 2008], to mention a few. We are interested in sequential Bayesian inference in settings where we have a sequence of posterior distributions that we need to compute. To be specific, we are focusing on settings where the model (or state variable) is high-dimensional, but where there are local dependencies. One example of the type of models we consider are the so-called spatiotemporal models [Wikle, 2015, Cressie and Wikle, 2011, Rue and Held, 2005]. Sequential Monte Carlo (SMC) methods comprise one of the most successful methodologies for sequential Bayesian inference. However, SMC struggles in high dimensions and these methods are rarely used for dimensions, say, higher than ten [Rebeschini and van Handel, 2015].

artificial intelligence, dx 1, machine learning, (13 more...)

arXiv.org Machine Learning

1612.09162

Country:

Europe (1.00)
North America > United States (0.67)
Asia > Middle East > Jordan (0.24)

Genre:

Research Report (0.64)
Workflow (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Add feedback

Partial Membership Latent Dirichlet Allocation

Chen, Chao, Zare, Alina, Trinh, Huy, Omotara, Gbeng, Cobb, J. Tory, Lagaunne, Timotius

arXiv.org Machine LearningDec-28-2016

Topic models (e.g., pLSA, LDA, sLDA) have been widely used for segmenting imagery. However, these models are confined to crisp segmentation, forcing a visual word (i.e., an image patch) to belong to one and only one topic. Yet, there are many images in which some regions cannot be assigned a crisp categorical label (e.g., transition regions between a foggy sky and the ground or between sand and water at a beach). In these cases, a visual word is best represented with partial memberships across multiple topics. To address this, we present a partial membership latent Dirichlet allocation (PM-LDA) model and an associated parameter estimation algorithm. This model can be useful for imagery where a visual word may be a mixture of multiple topics. Experimental results on visual and sonar imagery show that PM-LDA can produce both crisp and soft semantic image segmentations; a capability previous topic modeling methods do not have.

machine learning, natural language, pm-lda, (17 more...)

arXiv.org Machine Learning

1612.08936

Country:

North America > United States > Missouri > Boone County > Columbia (0.14)
North America > United States > Florida > Alachua County > Gainesville (0.14)

Genre: Research Report (0.64)

Industry:

Government > Military (0.93)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Add feedback

A Review of Multivariate Distributions for Count Data Derived from the Poisson Distribution

Inouye, David I., Yang, Eunho, Allen, Genevera I., Ravikumar, Pradeep

arXiv.org Machine LearningDec-27-2016

The Poisson distribution has been widely studied and used for modeling univariate count-valued data. Multivariate generalizations of the Poisson distribution that permit dependencies, however, have been far less popular. Yet, real-world high-dimensional count-valued data found in word counts, genomics, and crime statistics, for example, exhibit rich dependencies, and motivate the need for multivariate distributions that can appropriately model this data. We review multivariate distributions derived from the univariate Poisson, categorizing these models into three main classes: 1) where the marginal distributions are Poisson, 2) where the joint distribution is a mixture of independent multivariate Poisson distributions, and 3) where the node-conditional distributions are derived from the Poisson. We discuss the development of multiple instances of these classes and compare the models in terms of interpretability and theory. Then, we empirically compare multiple models from each class on three real-world datasets that have varying data characteristics from different domains, namely traffic accident data, biological next generation sequencing data, and text data. These empirical experiments develop intuition about the comparative advantages and disadvantages of each class of multivariate distribution that was derived from the Poisson. Finally, we suggest new research directions as explored in the subsequent discussion section.

artificial intelligence, machine learning, natural language, (22 more...)

arXiv.org Machine Learning

1609.00066

Country: North America > United States (0.67)

Genre:

Overview (1.00)
Research Report > New Finding (0.46)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
(2 more...)

Add feedback

Bayesian Basics, Explained

#artificialintelligenceDec-23-2016, 04:00:22 GMT

Editor's note: The following is an interview with Columbia University Professor Andrew Gelman conducted by Marketing scientist Kevin Gray, in which Gelman spells out the ABCs of Bayesian statistics. Andrew Gelman: Bayesian statistics uses the mathematical rules of probability to combines data with "prior information" to give inferences which (if the model being used is correct) are more precise than would be obtained by either source of information alone. Classical statistical methods avoid prior distributions. In classical statistics, you might include in your model a predictor (for example), or you might exclude it, or you might pool it as part of some larger set of predictors in order to get a more stable estimate. These are pretty much your only choices.

artificial intelligence, bayesian method, machine learning, (16 more...)

#artificialintelligence

Genre: Personal (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Bayesian Differential Privacy through Posterior Sampling

Dimitrakakis, Christos, Nelson, Blaine, Zhang, and Zuhe, Mitrokotsa, Aikaterini, Rubinstein, Benjamin

arXiv.org Machine LearningDec-23-2016

Differential privacy formalises privacy-preserving mechanisms that provide access to a database. We pose the question of whether Bayesian inference itself can be used directly to provide private access to data, with no modification. The answer is affirmative: under certain conditions on the prior, sampling from the posterior distribution can be used to achieve a desired level of privacy and utility. To do so, we generalise differential privacy to arbitrary dataset metrics, outcome spaces and distribution families. This allows us to also deal with non-i.i.d or non-tabular datasets. We prove bounds on the sensitivity of the posterior to the data, which gives a measure of robustness. We also show how to use posterior sampling to provide differentially private responses to queries, within a decision-theoretic framework. Finally, we provide bounds on the utility and on the distinguishability of datasets. The latter are complemented by a novel use of Le Cam's method to obtain lower bounds. All our general results hold for arbitrary database metrics, including those for the common definition of differential privacy. For specific choices of the metric, we give a number of examples satisfying our assumptions.

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Machine Learning

1306.1066

Country:

Europe (0.28)
North America > United States (0.28)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Data Science (0.87)

Add feedback