AITopics | Bayesian Inference

Collaborating Authors

Bayesian Inference

Bayes' Theorem allows a program to infer the probabilities of likely causes from the probabilities of their effects, when what it is given are the probabilities of effects, given the causes.

News Overviews Instructional Materials AI-Alerts Classics

The detour problem in a stochastic environment: Tolman revisited

Fakhari, Pegah, Khodadadi, Arash, Busemeyer, Jerome

arXiv.org Machine LearningSep-27-2017

We designed a grid world task to study human planning and re-planning behavior in an unknown stochastic environment. In our grid world, participants were asked to travel from a random starting point to a random goal position while maximizing their reward. Because they were not familiar with the environment, they needed to learn its characteristics from experience to plan optimally. Later in the task, we randomly blocked the optimal path to investigate whether and how people adjust their original plans to find a detour. To this end, we developed and compared 12 different models. These models were different on how they learned and represented the environment and how they planned to catch the goal. The majority of our participants were able to plan optimally. We also showed that people were capable of revising their plans when an unexpected event occurred. The result from the model comparison showed that the model-based reinforcement learning approach provided the best account for the data and outperformed heuristics in explaining the behavioral data in the re-planning trials.

artificial intelligence, machine learning, reinforcement learning, (21 more...)

arXiv.org Machine Learning

1709.09761

Country: North America > United States (0.93)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.45)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

Add feedback

Multi-way Interacting Regression via Factorization Machines

Yurochkin, Mikhail, Nguyen, XuanLong, Vasiloglou, Nikolaos

arXiv.org Machine LearningSep-26-2017

We propose a Bayesian regression method that accounts for multi-way interactions of arbitrary orders among the predictor variables. Our model makes use of a factorization mechanism for representing the regression coefficients of interactions among the predictors, while the interaction selection is guided by a prior distribution on random hypergraphs, a construction which generalizes the Finite Feature Model. We present a posterior inference algorithm based on Gibbs sampling, and establish posterior consistency of our regression model. Our method is evaluated with extensive experiments on simulated data and demonstrated to be able to identify meaningful interactions in applications in genetics and retail demand forecasting.

artificial intelligence, bayesian inference, machine learning, (19 more...)

arXiv.org Machine Learning

1709.09301

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Add feedback

Generative learning for deep networks

Flach, Boris, Shekhovtsov, Alexander, Fikar, Ondrej

arXiv.org Machine LearningSep-25-2017

Learning, taking into account full distribution of the data, referred to as generative, is not feasible with deep neural networks (DNNs) because they model only the conditional distribution of the outputs given the inputs. Current solutions are either based on joint probability models facing difficult estimation problems or learn two separate networks, mapping inputs to outputs (recognition) and vice-versa (generation). We propose an intermediate approach. First, we show that forward computation in DNNs with logistic sigmoid activations corresponds to a simplified approximate Bayesian inference in a directed probabilistic multi-layer model. This connection allows to interpret DNN as a probabilistic model of the output and all hidden units given the input. Second, we propose that in order for the recognition and generation networks to be more consistent with the joint model of the data, weights of the recognition and generator network should be related by transposition. We demonstrate in a tentative experiment that such a coupled pair can be learned generatively, modelling the full distribution of the data, and has enough capacity to perform well in both recognition and generation.

approximation, artificial intelligence, machine learning, (19 more...)

arXiv.org Machine Learning

1709.08524

Country: North America > Canada (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.71)

Add feedback

On the Model Shrinkage Effect of Gamma Process Edge Partition Models

Ohama, Iku, Sato, Issei, Kida, Takuya, Arimura, Hiroki

arXiv.org Machine LearningSep-25-2017

The edge partition model (EPM) is a fundamental Bayesian nonparametric model for extracting an overlapping structure from binary matrix. The EPM adopts a gamma process ($\Gamma$P) prior to automatically shrink the number of active atoms. However, we empirically found that the model shrinkage of the EPM does not typically work appropriately and leads to an overfitted solution. An analysis of the expectation of the EPM's intensity function suggested that the gamma priors for the EPM hyperparameters disturb the model shrinkage effect of the internal $\Gamma$P. In order to ensure that the model shrinkage effect of the EPM works in an appropriate manner, we proposed two novel generative constructions of the EPM: CEPM incorporating constrained gamma priors, and DEPM incorporating Dirichlet priors instead of the gamma priors. Furthermore, all DEPM's model parameters including the infinite atoms of the $\Gamma$P prior could be marginalized out, and thus it was possible to derive a truly infinite DEPM (IDEPM) that can be efficiently inferred using a collapsed Gibbs sampler. We experimentally confirmed that the model shrinkage of the proposed models works well and that the IDEPM indicated state-of-the-art performance in generalization ability, link prediction accuracy, mixing efficiency, and convergence speed.

artificial intelligence, epm, machine learning, (18 more...)

arXiv.org Machine Learning

1709.0877

Country: Asia > Japan (0.46)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)

Add feedback

Robust Probabilistic Modeling with Bayesian Data Reweighting

Wang, Yixin, Kucukelbir, Alp, Blei, David M.

arXiv.org Artificial IntelligenceSep-23-2017

Probabilistic models analyze data by relying on a set of assumptions. Data that exhibit deviations from these assumptions can undermine inference and prediction quality. Robust models offer protection against mismatch between a model's assumptions and reality. We propose a way to systematically detect and mitigate mismatch of a large class of probabilistic models. The idea is to raise the likelihood of each observation to a weight and then to infer both the latent variables and the weights from data. Inferring the weights allows a model to identify observations that match its assumptions and down-weight others. This enables robust inference and improves predictive accuracy. We study four different forms of mismatch with reality, ranging from missing latent groups to structure misspecification. A Poisson factorization analysis of the Movielens 1M dataset shows the benefits of this approach in a practical scenario.

artificial intelligence, drama 0, machine learning, (14 more...)

arXiv.org Artificial Intelligence

1606.0386

Country: North America > United States (1.00)

Genre: Research Report (0.83)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Approximate Bayesian Inference in Linear State Space Models for Intermittent Demand Forecasting at Scale

Seeger, Matthias, Rangapuram, Syama, Wang, Yuyang, Salinas, David, Gasthaus, Jan, Januschowski, Tim, Flunkert, Valentin

arXiv.org Machine LearningSep-22-2017

We present a scalable and robust Bayesian inference method for linear state space models. The method is applied to demand forecasting in the context of a large e-commerce platform, paying special attention to intermittent and bursty target statistics. Inference is approximated by the Newton-Raphson algorithm, reduced to linear-time Kalman smoothing, which allows us to operate on several orders of magnitude larger problems than previous related work. In a study on large real-world sales datasets, our method outperforms competing approaches on fast and medium moving items.

artificial intelligence, likelihood, machine learning, (17 more...)

arXiv.org Machine Learning

1709.07638

Country: Europe (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

Add feedback

Generalized Bayesian Updating and the Loss-Likelihood Bootstrap

Lyddon, Simon, Holmes, Chris, Walker, Stephen

arXiv.org Machine LearningSep-22-2017

In this paper, we revisit the weighted likelihood bootstrap and show that it is well-motivated for Bayesian inference under misspecified models. We extend the underlying idea to a wider family of inferential problems. This allows us to calibrate an analogue of the likelihood function in situations where little is known about the data-generating mechanism. We demonstrate our method on a number of examples. There are some problems that arise when Bayesian methods are applied in modern settings. The construction of a global probabilistic representation through a joint model of the environment is often an impossible task. If the data does not come from the ascribed probability model then the posterior uncertainty quantification is theoretically invalid; the coherence and rationality that is foundational to Bayesian theory is lost. Often there are a finite number of functionals (or parameters) of interest to the practitioner, or decisions to be made. In this case it would be desirable to target these parameters directly, making as few assumptions about the rest of the environment as possible.

artificial intelligence, machine learning, posterior, (19 more...)

arXiv.org Machine Learning

1709.07616

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

A Dirichlet Mixture Model of Hawkes Processes for Event Sequence Clustering

Xu, Hongteng, Zha, Hongyuan

arXiv.org Machine LearningSep-21-2017

We propose an effective method to solve the event sequence clustering problems based on a novel Dirichlet mixture model of a special but significant type of point processes --- Hawkes process. In this model, each event sequence belonging to a cluster is generated via the same Hawkes process with specific parameters, and different clusters correspond to different Hawkes processes. The prior distribution of the Hawkes processes is controlled via a Dirichlet distribution. We learn the model via a maximum likelihood estimator (MLE) and propose an effective variational Bayesian inference algorithm. We specifically analyze the resulting EM-type algorithm in the context of inner-outer iterations and discuss several inner iteration allocation strategies. The identifiability of our model, the convergence of our learning method, and its sample complexity are analyzed in both theoretical and empirical ways, which demonstrate the superiority of our method to other competitors. The proposed method learns the number of clusters automatically and is robust to model misspecification. Experiments on both synthetic and real-world data show that our method can learn diverse triggering patterns hidden in asynchronous event sequences and achieve encouraging performance on clustering purity and consistency.

artificial intelligence, hawke process, machine learning, (18 more...)

arXiv.org Machine Learning

1701.09177

Genre: Research Report (1.00)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Add feedback

An Expectation Conditional Maximization approach for Gaussian graphical models

Li, Zehang, McCormick, Tyler H.

arXiv.org Machine LearningSep-20-2017

Bayesian graphical models are a useful tool for understanding dependence relationships among many variables, particularly in situations with external prior information. In high-dimensional settings, the space of possible graphs becomes enormous, rendering even state-of-the-art Bayesian stochastic search computationally infeasible. We propose a deterministic alternative to estimate Gaussian and Gaussian copula graphical models using an Expectation Conditional Maximization (ECM) algorithm, extending the EM approach from Bayesian variable selection to graphical model estimation. We show that the ECM approach enables fast posterior exploration under a sequence of mixture priors, and can incorporate multiple sources of information.

artificial intelligence, bayesian inference, machine learning, (15 more...)

arXiv.org Machine Learning

1709.0697

Genre: Research Report (0.64)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Systems & Languages (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Add feedback

Modeling sequences and temporal networks with dynamic community structures

Peixoto, Tiago P., Rosvall, Martin

arXiv.org Machine LearningSep-20-2017

In evolving complex systems such as air traffic and social organizations, collective effects emerge from their many components' dynamic interactions. While the dynamic interactions can be represented by temporal networks with nodes and links that change over time, they remain highly complex. It is therefore often necessary to use methods that extract the temporal networks' large-scale dynamic community structure. However, such methods are subject to overfitting or suffer from effects of arbitrary, a priori imposed timescales, which should instead be extracted from data. Here we simultaneously address both problems and develop a principled data-driven method that determines relevant timescales and identifies patterns of dynamics that take place on networks as well as shape the networks themselves. We base our method on an arbitrary-order Markov chain model with community structure, and develop a nonparametric Bayesian inference framework that identifies the simplest such model that can explain temporal interaction data.

artificial intelligence, likelihood, machine learning, (18 more...)

arXiv.org Machine Learning

doi: 10.1038/s41467-017-00148-9

1509.0474

Country:

North America > United States (1.00)
Europe (1.00)

Genre: Research Report (0.64)

Industry:

Transportation (1.00)
Education (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Add feedback