AITopics | Bayesian Inference

Collaborating Authors

Bayesian Inference

Bayes' Theorem allows a program to infer the probabilities of likely causes from the probabilities of their effects, when what it is given are the probabilities of effects, given the causes.

News Overviews Instructional Materials AI-Alerts Classics

Identification of jump Markov linear models using particle filters

Svensson, Andreas, Schön, Thomas B., Lindsten, Fredrik

arXiv.org Machine LearningSep-25-2014

Jump Markov linear models consists of a finite number of linear state space models and a discrete variable encoding the jumps (or switches) between the different linear models. Identifying jump Markov linear models makes for a challenging problem lacking an analytical solution. We derive a new expectation maximization (EM) type algorithm that produce maximum likelihood estimates of the model parameters. Our development hinges upon recent progress in combining particle filters with Markov chain Monte Carlo methods in solving the nonlinear state smoothing problem inherent in the EM formulation. Key to our development is that we exploit a conditionally linear Gaussian substructure in the model, allowing for an efficient algorithm.

artificial intelligence, jump markov linear model, machine learning, (13 more...)

arXiv.org Machine Learning

doi: 10.1109/CDC.2014.7040409

1409.7287

Country: Europe (0.46)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Add feedback

On tensor rank of conditional probability tables in Bayesian networks

Vomlel, Jiří, Tichavský, Petr

arXiv.org Artificial IntelligenceSep-22-2014

A difficult task in modeling with Bayesian networks is the elicitation of numerical parameters of Bayesian networks. A large number of parameters is needed to specify a conditional probability table (CPT) that has a larger parent set. In this paper we show that, most CPTs from real applications of Bayesian networks can actually be very well approximated by tables that require substantially less parameters. This observation has practical consequence not only for model elicitation but also for efficient probabilistic reasoning with these networks.

artificial intelligence, bayesian network, machine learning, (16 more...)

arXiv.org Artificial Intelligence

1409.6287

Country:

Europe > Czechia (0.16)
Europe > Spain (0.15)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Expectation Propagation

Raymond, Jack, Manoel, Andre, Opper, Manfred

arXiv.org Machine LearningSep-22-2014

Variational inference is a powerful concept that underlies many iterative approximation algorithms; expectation propagation, mean-field methods and belief propagations were all central themes at the school that can be perceived from this unifying framework. The lectures of Manfred Opper introduce the archetypal example of Expectation Propagation, before establishing the connection with the other approximation methods. Corrections by expansion about the expectation propagation are then explained. Finally some advanced inference topics and applications are explored in the final sections.

approximation, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

1409.6179

Country: North America > United States (1.00)

Genre: Research Report (0.50)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)

Add feedback

SAME but Different: Fast and High-Quality Gibbs Parameter Estimation

Zhao, Huasha, Jiang, Biye, Canny, John

arXiv.org Machine LearningSep-18-2014

Gibbs sampling is a workhorse for Bayesian inference but has several limitations when used for parameter estimation, and is often much slower than non-sampling inference methods. SAME (State Augmentation for Marginal Estimation) \cite{Doucet99,Doucet02} is an approach to MAP parameter estimation which gives improved parameter estimates over direct Gibbs sampling. SAME can be viewed as cooling the posterior parameter distribution and allows annealed search for the MAP parameters, often yielding very high quality (lower loss) estimates. But it does so at the expense of additional samples per iteration and generally slower performance. On the other hand, SAME dramatically increases the parallelism in the sampling schedule, and is an excellent match for modern (SIMD) hardware. In this paper we explore the application of SAME to graphical model inference on modern hardware. We show that combining SAME with factored sample representation (or approximation) gives throughput competitive with the fastest symbolic methods, but with potentially better quality. We describe experiments on Latent Dirichlet Allocation, achieving speeds similar to the fastest reported methods (online Variational Bayes) and lower cross-validated loss than other LDA implementations. The method is simple to implement and should be applicable to many other models.

artificial intelligence, implementation, machine learning, (17 more...)

arXiv.org Machine Learning

1409.5402

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.70)

Add feedback

Model-based Kernel Sum Rule

Nishiyama, Yu, Kanagawa, Motonobu, Gretton, Arthur, Fukumizu, Kenji

arXiv.org Machine LearningSep-17-2014

In this study, we enrich the framework of nonparametric kernel Bayesian inference via the flexible incorporation of certain probabilistic models, such as additive Gaussian noise models. Nonparametric inference expressed in terms of kernel means, which is called kernel Bayesian inference, has been studied using basic rules such as the kernel sum rule (KSR), kernel chain rule, kernel product rule, and kernel Bayes' rule (KBR). However, the current framework used for kernel Bayesian inference deals only with nonparametric inference and it cannot allow inference when combined with probabilistic models. In this study, we introduce a novel KSR, called model-based KSR (Mb-KSR), which exploits the knowledge obtained from some probabilistic models of conditional distributions. The incorporation of Mb-KSR into nonparametric kernel Bayesian inference facilitates more flexible kernel Bayesian inference than nonparametric inference. We focus on combinations of Mb-KSR, Non-KSR, and KBR, and we propose a filtering algorithm for state space models, which combines nonparametric learning of the observation process using kernel means and additive Gaussian noise models of the transition dynamics. The idea of the Mb-KSR for additive Gaussian noise models can be extended to more general noise model cases, including a conjugate pair with a positive-definite kernel and a probabilistic model.

artificial intelligence, machine learning, probabilistic model, (16 more...)

arXiv.org Machine Learning

1409.5178

Country: Asia > Japan (0.14)

Genre: Research Report > New Finding (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Add feedback

Collapsed Variational Bayes Inference of Infinite Relational Model

Ishiguro, Katsuhiko, Sato, Issei, Ueda, Naonori

arXiv.org Machine LearningSep-16-2014

The Infinite Relational Model (IRM) is a probabilistic model for relational data clustering that partitions objects into clusters based on observed relationships. This paper presents Averaged CVB (ACVB) solutions for IRM, convergence-guaranteed and practically useful fast Collapsed Variational Bayes (CVB) inferences. We first derive ordinary CVB and CVB0 for IRM based on the lower bound maximization. CVB solutions yield deterministic iterative procedures for inferring IRM given the truncated number of clusters. Our proposal includes CVB0 updates of hyperparameters including the concentration parameter of the Dirichlet Process, which has not been studied in the literature. To make the CVB more practically useful, we further study the CVB inference in two aspects. First, we study the convergence issues and develop a convergence-guaranteed algorithm for any CVB-based inferences called ACVB, which enables automatic convergence detection and frees non-expert practitioners from difficult and costly manual monitoring of inference processes. Second, we present a few techniques for speeding up IRM inferences. In particular, we describe the linear time inference of CVB0, allowing the IRM for larger relational data uses. The ACVB solutions of IRM showed comparable or better performance compared to existing inference methods in experiments, and provide deterministic, faster, and easier convergence detection.

data mining, machine learning, natural language, (20 more...)

arXiv.org Machine Learning

1409.4757

Country:

Asia > Japan (0.28)
North America > United States (0.28)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology > Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.54)
(2 more...)

Add feedback

Probabilistic Selection in AgentSpeak(L)

Coelho, Francisco, Nogueira, Vitor

arXiv.org Artificial IntelligenceSep-12-2014

Agent programming is mostly a symbolic discipline and, as such, draws little benefits from probabilistic areas as machine learning and graphical models. However, the greatest objective of agent research is the achievement of autonomy in dynamical and complex environments --- a goal that implies embracing uncertainty and therefore the entailed representations, algorithms and techniques. This paper proposes an innovative and conflict free two layer approach to agent programming that uses already established methods and tools from both symbolic and probabilistic artificial intelligence. Moreover, this framework is illustrated by means of a widely used agent programming example, GoldMiners.

artificial intelligence, bayesian inference, machine learning, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-3-319-25524-8_44

1409.3717

Country:

South America > Brazil > Rio Grande do Sul (0.04)
Europe > Germany > Saarland (0.04)
Asia > India (0.04)

Genre: Research Report (0.40)

Industry: Materials > Metals & Mining (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.50)

Add feedback

Parsimonious Topic Models with Salient Word Discovery

Soleimani, Hossein, Miller, David J.

arXiv.org Machine LearningSep-11-2014

We propose a parsimonious topic model for text corpora. In related models such as Latent Dirichlet Allocation (LDA), all words are modeled topic-specifically, even though many words occur with similar frequencies across different topics. Our modeling determines salient words for each topic, which have topic-specific probabilities, with the rest explained by a universal shared model. Further, in LDA all topics are in principle present in every document. By contrast our model gives sparse topic representation, determining the (small) subset of relevant topics for each document. We derive a Bayesian Information Criterion (BIC), balancing model complexity and goodness of fit. Here, interestingly, we identify an effective sample size and corresponding penalty specific to each parameter type in our model. We minimize BIC to jointly determine our entire model -- the topic-specific words, document-specific topics, all model parameter values, {\it and} the total number of topics -- in a wholly unsupervised fashion. Results on three text corpora and an image dataset show that our model achieves higher test set likelihood and better agreement with ground-truth class labels, compared to LDA and to a model designed to incorporate sparsity.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Machine Learning

doi: 10.1109/TKDE.2014.2345378

1401.6169

Country: North America > United States > Pennsylvania (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.91)

Add feedback

Scalable Bayesian Modelling of Paired Symbols

Paquet, Ulrich, Koenigstein, Noam, Winther, Ole

arXiv.org Machine LearningSep-10-2014

We present a novel, scalable and Bayesian approach to modelling the occurrence of pairs of symbols (i, j) drawn from a large vocabulary. Observed pairs are assumed to be generated by a simple popularity based selection process followed by censoring using a preference function. By basing inference on the well-founded principle of variational bounding, and using new site-independent bounds, we show how a scalable inference procedure can be obtained for large data sets. State of the art results are presented on real-world movie viewing data.

artificial intelligence, likelihood, machine learning, (17 more...)

arXiv.org Machine Learning

1409.2824

Country: Europe (0.46)

Genre: Research Report (0.82)

Industry: Law > Civil Rights & Constitutional Law (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Add feedback

Context-specific independence in graphical log-linear models

Nyman, Henrik, Pensar, Johan, Koski, Timo, Corander, Jukka

arXiv.org Machine LearningSep-9-2014

Log-linear models are the popular workhorses of analyzing contingency tables. A log-linear parameterization of an interaction model can be more expressive than a direct parameterization based on probabilities, leading to a powerful way of defining restrictions derived from marginal, conditional and context-specific independence. However, parameter estimation is often simpler under a direct parameterization, provided that the model enjoys certain decomposability properties. Here we introduce a cyclical projection algorithm for obtaining maximum likelihood estimates of log-linear parameters under an arbitrary context-specific graphical log-linear model, which needs not satisfy criteria of decomposability. We illustrate that lifting the restriction of decomposability makes the models more expressive, such that additional context-specific independencies embedded in real data can be identified. It is also shown how a context-specific graphical model can correspond to a non-hierarchical log-linear parameterization with a concise interpretation. This observation can pave way to further development of non-hierarchical log-linear models, which have been largely neglected due to their believed lack of interpretability.

artificial intelligence, bayesian inference, machine learning, (20 more...)

arXiv.org Machine Learning

doi: 10.1007/s00180-015-0606-6

1409.2713

Country: Europe > Finland (0.15)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback