AITopics | Bayesian Inference

Bayesian Inference

Bayes' Theorem allows a program to infer the probabilities of likely causes from their observed effects, when what it is given are the probabilities of the effects given the causes, together with the prior probabilities of the causes.
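
As a rough illustration of the statement above, the following minimal sketch computes P(cause | effect) from P(effect | cause) and a prior over causes. The function name `posterior` and the numbers in the example are hypothetical and chosen only for illustration.

```python
def posterior(prior, likelihood):
    """Return P(cause | effect) for each cause via Bayes' theorem.

    prior:      dict mapping cause -> P(cause)
    likelihood: dict mapping cause -> P(effect | cause)
    """
    # Unnormalised posterior: P(effect | cause) * P(cause)
    joint = {c: likelihood[c] * prior[c] for c in prior}
    # Evidence P(effect), by the law of total probability
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("effect has zero probability under every cause")
    return {c: joint[c] / evidence for c in joint}


# Hypothetical numbers: a rare cause and a fairly reliable test.
prior = {"disease": 0.01, "healthy": 0.99}
likelihood = {"disease": 0.95, "healthy": 0.05}  # P(positive test | cause)
post = posterior(prior, likelihood)
print(post)
```

Even with a likelihood of 0.95, the posterior probability of the rare cause stays well below one half in this example, because the prior is so small.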

Metropolis Sampling

Martino, Luca, Elvira, Victor

arXiv.org Machine Learning, Apr-15-2017

Monte Carlo (MC) sampling methods are widely applied in Bayesian inference, system simulation and optimization problems. Markov Chain Monte Carlo (MCMC) algorithms are a well-known class of MC methods which generate a Markov chain with the desired invariant distribution. In this document, we focus on the Metropolis-Hastings (MH) sampler, which can be considered the atom of the MCMC techniques, introducing the basic notions and different properties. We describe in detail all the elements involved in the MH algorithm and the most relevant variants. Several improvements and recent extensions proposed in the literature are also briefly discussed, providing a quick but exhaustive overview of the current world of Metropolis-based sampling.
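
For readers unfamiliar with the sampler the abstract refers to, here is a minimal sketch of random-walk Metropolis, the simplest MH variant. It is not taken from the paper: the function name, the Gaussian proposal, the step size and the burn-in length are all assumptions made for illustration.

```python
import math
import random


def metropolis_hastings(log_target, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis sampler for a 1-D unnormalised log density."""
    rng = random.Random(seed)
    x, log_p = x0, log_target(x0)
    samples = []
    accepted = 0
    for _ in range(n_samples):
        # Symmetric Gaussian proposal, so the Hastings correction cancels
        x_new = x + rng.gauss(0.0, step)
        log_p_new = log_target(x_new)
        # Accept with probability min(1, p(x_new) / p(x));
        # 1 - random() lies in (0, 1], which keeps log() defined
        if math.log(1.0 - rng.random()) < log_p_new - log_p:
            x, log_p = x_new, log_p_new
            accepted += 1
        # On rejection the current state is repeated in the chain
        samples.append(x)
    return samples, accepted / n_samples


def log_std_normal(x):
    """Log density of a standard normal, normalising constant omitted."""
    return -0.5 * x * x


samples, acc_rate = metropolis_hastings(log_std_normal, x0=0.0,
                                        n_samples=20000, step=2.4)
kept = samples[2000:]  # discard an assumed burn-in period
mean = sum(kept) / len(kept)
var = sum((s - mean) ** 2 for s in kept) / len(kept)
print(f"acceptance rate {acc_rate:.2f}, mean {mean:.3f}, variance {var:.3f}")
```

Only ratios of the target density are needed, which is why the normalising constant can be left out of `log_std_normal`.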

artificial intelligence, bayesian inference, machine learning, (13 more...)

arXiv:1704.04629

Country:

North America > United States (1.00)
Europe (1.00)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

Applying Bayes Theorem to a Big Data World

#artificialintelligence, Apr-14-2017, 17:20:14 GMT

It is not death that a man should fear but he should fear never beginning to live.

artificial intelligence, data mining, machine learning, (3 more...)

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.40)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.40)

Infinite Sparse Structured Factor Analysis

Pearce, Matthew C., White, Simon R.

arXiv.org Machine Learning, Apr-13-2017

Matrix factorisation methods decompose multivariate observations as linear combinations of latent feature vectors. The Indian Buffet Process (IBP) provides a way to model the number of latent features required for a good approximation in terms of regularised reconstruction error. Previous work has focussed on latent feature vectors with independent entries. We extend the model to include nondiagonal latent covariance structures representing characteristics such as smoothness. This is done by . Using simulations we demonstrate that under appropriate conditions a smoothness prior helps to recover the true latent features, while denoising more accurately. We demonstrate our method on a real neuroimaging dataset, where computational tractability is a sufficient challenge that the efficient strategy presented here is essential.

artificial intelligence, bayesian inference, machine learning, (19 more...)

arXiv:1704.04031

Genre: Research Report (0.40)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.89)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Beyond Uniform Priors in Bayesian Network Structure Learning

Scutari, Marco

arXiv.org Machine Learning, Apr-12-2017

Bayesian network structure learning is often performed in a Bayesian setting, evaluating candidate structures using their posterior probabilities for a given data set. Score-based algorithms then use those posterior probabilities as an objective function and return the maximum a posteriori network as the learned model. For discrete Bayesian networks, the canonical choice for a posterior score is the Bayesian Dirichlet equivalent uniform (BDeu) marginal likelihood with a uniform (U) graph prior, which assumes a uniform prior both on the network structures and on the parameters of the networks. In this paper, we investigate the problems arising from these assumptions, focusing on those caused by small sample sizes and sparse data. We then propose an alternative posterior score: the Bayesian Dirichlet sparse (BDs) marginal likelihood with a marginal uniform (MU) graph prior. Like U BDeu, MU BDs does not require any prior information on the probabilistic structure of the data and can be used as a replacement noninformative score. We study its theoretical properties and we evaluate its performance in an extensive simulation study, showing that MU BDs is both more accurate than U BDeu in learning the structure of the network and competitive in predicting power, while not being computationally more complex to estimate.

artificial intelligence, bdeu, machine learning, (17 more...)

arXiv:1704.03942

Country: Europe > United Kingdom (0.27)

Genre: Research Report > Experimental Study (0.89)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Sampling-based speech parameter generation using moment-matching networks

Takamichi, Shinnosuke, Koriyama, Tomoki, Saruwatari, Hiroshi

arXiv.org Machine Learning, Apr-12-2017

This paper presents sampling-based speech parameter generation using moment-matching networks for Deep Neural Network (DNN)-based speech synthesis. Although people never produce exactly the same speech even when they try to express the same linguistic and para-linguistic information, typical statistical speech synthesis produces exactly the same speech every time, i.e., there is no inter-utterance variation in synthetic speech. To give synthetic speech natural inter-utterance variation, this paper builds DNN acoustic models that make it possible to randomly sample speech parameters. The DNNs are trained so that the moments of the generated speech parameters are close to those of natural speech parameters. Since the variation of speech parameters is compressed into a low-dimensional simple prior noise vector, our algorithm has a lower computation cost than direct sampling of speech parameters. As the first step towards generating synthetic speech that has natural inter-utterance variation, this paper investigates whether or not the proposed sampling-based generation deteriorates synthetic speech quality. In the evaluation, we compare the speech quality of conventional maximum likelihood-based generation and the proposed sampling-based generation. The result demonstrates that the proposed generation causes no degradation in speech quality.

artificial intelligence, bayesian inference, machine learning, (20 more...)

arXiv:1704.03626

Country:

Europe > Germany (0.29)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Speech (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)
(2 more...)

The Stochastic complexity of spin models: How simple are simple spin models?

Beretta, Alberto, Battistin, Claudia, de Mulatier, Clélia, Mastromatteo, Iacopo, Marsili, Matteo

arXiv.org Machine Learning, Apr-12-2017

Alberto Beretta (1), Claudia Battistin (2), Clélia de Mulatier (1), Iacopo Mastromatteo (3), and Matteo Marsili (1)
(1) The Abdus Salam International Centre for Theoretical Physics (ICTP), Strada Costiera 11, I-34014 Trieste, Italy
(2) Kavli Institute for Systems Neuroscience and Centre for Neural Computation, Olav Kyrres gate 9, 7030 Trondheim, Norway
(3) Capital Fund Management, 23 rue de l'Université, 75007 Paris, France

Simple models, in information theoretic terms, are those with a small stochastic complexity. We study the stochastic complexity of spin models with interactions of arbitrary order. Invariance with respect to bijections within the space of operators allows us to classify models in complexity classes. This invariance also shows that simplicity is not related to the order of the interactions, but rather to their mutual arrangement.

artificial intelligence, bayesian inference, machine learning, (19 more...)

arXiv:1702.07549

Country:

Europe > Norway > Central Norway > Trøndelag > Trondheim (0.24)
Europe > Italy > Friuli Venezia Giulia > Trieste Province > Trieste (0.24)
Europe > France > Île-de-France > Paris > Paris (0.24)

Genre: Research Report (0.40)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Bay Area Probabilistic Programming Meetup

#artificialintelligence, Apr-11-2017, 17:05:27 GMT

Are probabilistic programming and Bayesian reasoning algorithms the next big thing in machine learning? The idea behind the probabilistic programming approach to machine learning is that the model of the data can be separated from the algorithms that do inference on the model. This allows you to devote your energy to building models tailored to your decision problem, as opposed to constraining your problem so it works with some machine learning tool. This idea opens machine learning to domain experts. Indeed, probabilistic programming grew out of probabilistic graphical models, which revolutionized AI by enabling expert knowledge to be built into graphs powered by Bayesian inference.

artificial intelligence, bay area probabilistic programming meetup, machine learning, (4 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.39)

Dense Distributions from Sparse Samples: Improved Gibbs Sampling Parameter Estimators for LDA

Papanikolaou, Yannis, Foulds, James R., Rubin, Timothy N., Tsoumakas, Grigorios

arXiv.org Machine Learning, Apr-11-2017

We introduce a novel approach for estimating Latent Dirichlet Allocation (LDA) parameters from collapsed Gibbs samples (CGS), by leveraging the full conditional distributions over the latent variable assignments to efficiently average over multiple samples, for little more computational cost than drawing a single additional collapsed Gibbs sample. Our approach can be understood as adapting the soft clustering methodology of Collapsed Variational Bayes (CVB0) to CGS parameter estimation, in order to get the best of both techniques. Our estimators can straightforwardly be applied to the output of any existing implementation of CGS, including modern accelerated variants. We perform extensive empirical comparisons of our estimators with those of standard collapsed inference algorithms on real-world data for both unsupervised LDA and Prior-LDA, a supervised variant of LDA for multi-label classification. Our results show a consistent advantage of our approach over traditional CGS under all experimental conditions, and over CVB0 inference in the majority of conditions. More broadly, our results highlight the importance of averaging over multiple samples in LDA parameter estimation, and the use of efficient computational techniques to do so.
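
For context, the sketch below shows the conventional way of turning a single collapsed Gibbs sample into LDA parameter estimates, i.e. the kind of single-sample baseline the abstract compares against. It is not the estimator proposed in the paper. The function name, the toy count matrices and the symmetric scalar hyperparameters `alpha` and `beta` are assumptions made for illustration.

```python
def cgs_point_estimates(n_kw, n_dk, alpha, beta):
    """Standard LDA point estimates from one collapsed Gibbs sample.

    n_kw[k][w]: number of tokens of word w assigned to topic k
    n_dk[d][k]: number of tokens in document d assigned to topic k
    alpha, beta: symmetric Dirichlet hyperparameters (assumed scalar)
    """
    n_topics = len(n_kw)
    n_words = len(n_kw[0])
    # Topic-word distributions: smoothed, row-normalised topic counts
    phi = [[(c + beta) / (sum(row) + n_words * beta) for c in row]
           for row in n_kw]
    # Document-topic distributions: smoothed, row-normalised document counts
    theta = [[(c + alpha) / (sum(row) + n_topics * alpha) for c in row]
             for row in n_dk]
    return phi, theta


# Toy counts: 2 topics, 3 vocabulary words, 2 documents (10 tokens in total).
n_kw = [[3, 1, 0],
        [0, 2, 4]]
n_dk = [[2, 1],
        [2, 5]]
phi, theta = cgs_point_estimates(n_kw, n_dk, alpha=0.1, beta=0.01)
print(phi)
print(theta)
```

Because these estimates depend on one hard assignment of every token, they can be noisy, which is the motivation the abstract gives for averaging over multiple samples.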

artificial intelligence, machine learning, natural language, (16 more...)

arXiv:1505.02065

Country:

North America > United States > California (0.67)
Europe (0.67)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

10 Free Must-Read Books for Machine Learning and Data Science

#artificialintelligence, Apr-10-2017, 23:55:29 GMT

This book provides an introduction to statistical learning methods. It is aimed at upper-level undergraduate students, master's students and Ph.D. students in the non-mathematical sciences. The book also contains a number of R labs with detailed explanations on how to implement the various methods in real-life settings, and should be a valuable resource for a practicing data scientist.

artificial intelligence, bayesian inference, machine learning, (11 more...)

Genre:

Instructional Material > Course Syllabus & Notes (0.72)
Summary/Review (0.70)

Industry: Education (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)

Distributed Learning for Cooperative Inference

Nedić, Angelia, Olshevsky, Alex, Uribe, César A.

arXiv.org Machine Learning, Apr-10-2017

In a distributed system, the interactions between agents are usually restricted by constraints on the flow of information imposed by the network structure. Such constraints mean that agents can use only locally available information. This contrasts with centralized approaches, where all information and computation resources are available at a single location [24, 68, 64, 62]. One traditional problem in decision-making is that of parameter estimation or statistical learning. Given a set of noisy observations coming from a joint distribution, one would like to estimate a parameter or distribution that minimizes a certain loss function. For example, Maximum a Posteriori (MAP) or Minimum Least Squared Error (MLSE) estimators fit a parameter to some model of the observations. Both MAP and MLSE estimators require some form of Bayesian posterior computation based on models that explain the observations for a given parameter. Computation of such a posteriori distributions depends on having exact models of the likelihood of the corresponding observations.
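
To make the MAP idea mentioned above concrete, here is a minimal centralized sketch for the simplest case: estimating the mean of Gaussian observations with known noise variance under a Gaussian prior. It does not reproduce the distributed algorithm of the paper. The function name and the numbers are hypothetical.

```python
def map_gaussian_mean(observations, noise_var, prior_mean, prior_var):
    """MAP estimate of a Gaussian mean under a Gaussian prior.

    Assumes x_i ~ N(mu, noise_var) with noise_var known and
    mu ~ N(prior_mean, prior_var). The posterior is Gaussian,
    so its mode (the MAP estimate) equals its mean.
    """
    n = len(observations)
    # Posterior precision is the sum of prior and data precisions
    precision = 1.0 / prior_var + n / noise_var
    # Precision-weighted combination of prior mean and data
    return (prior_mean / prior_var + sum(observations) / noise_var) / precision


data = [1.0, 2.0, 3.0]
estimate = map_gaussian_mean(data, noise_var=1.0, prior_mean=0.0, prior_var=1.0)
# A very wide prior barely influences the result
flat_estimate = map_gaussian_mean(data, noise_var=1.0,
                                  prior_mean=0.0, prior_var=1e12)
print(estimate, flat_estimate)
```

The estimate is pulled from the sample mean towards the prior mean, and the pull weakens as the prior variance grows. The closed form relies on the likelihood model being exact, which is the dependence the abstract points out.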

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv:1704.02718

Country: North America > United States (0.93)

Genre: Research Report (0.82)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
