AITopics | Bayesian Inference

Collaborating Authors

Bayesian Inference

Bayes' Theorem allows a program to infer the probabilities of likely causes from the probabilities of their effects, when what it is given are the probabilities of effects, given the causes.

News Overviews Instructional Materials AI-Alerts Classics

Inference of Sparse Networks with Unobserved Variables. Application to Gene Regulatory Networks

Slavov, Nikolai

arXiv.org Machine LearningJun-1-2014

Networks are a unifying framework for modeling complex systems and network inference problems are frequently encountered in many fields. Here, I develop and apply a generative approach to network inference (RCweb) for the case when the network is sparse and the latent (not observed) variables affect the observed ones. From all possible factor analysis (FA) decompositions explaining the variance in the data, RCweb selects the FA decomposition that is consistent with a sparse underlying network. The sparsity constraint is imposed by a novel method that significantly outperforms (in terms of accuracy, robustness to noise, complexity scaling, and computational efficiency) Bayesian methods and MLE methods using l1 norm relaxation such as K-SVD and l1--based sparse principle component analysis (PCA). Results from simulated models demonstrate that RCweb recovers exactly the model structures for sparsity as low (as non-sparse) as 50% and with ratio of unobserved to observed variables as high as 2. RCweb is robust to noise, with gradual decrease in the parameter ranges as the noise level increases.

artificial intelligence, bayesian inference, machine learning, (15 more...)

arXiv.org Machine Learning

1406.0193

Country:

Europe (0.46)
North America > United States (0.28)

Genre: Research Report (0.70)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Adaptive Reconfiguration Moves for Dirichlet Mixtures

Herlau, Tue, Mørup, Morten, Teh, Yee Whye, Schmidt, Mikkel N.

arXiv.org Machine LearningMay-31-2014

Bayesian mixture models are widely applied for unsupervised learning and exploratory data analysis. Markov chain Monte Carlo based on Gibbs sampling and split-merge moves are widely used for inference in these models. However, both methods are restricted to limited types of transitions and suffer from torpid mixing and low accept rates even for problems of modest size. We propose a method that considers a broader range of transitions that are close to equilibrium by exploiting multiple chains in parallel and using the past states adaptively to inform the proposal distribution. The method significantly improves on Gibbs and split-merge sampling as quantified using convergence diagnostics and acceptance rates. Adaptive MCMC methods which use past states to inform the proposal distribution has given rise to many ingenious sampling schemes for continuous problems and the present work can be seen as an important first step in bringing these benefits to partition-based problems.

artificial intelligence, iteration, machine learning, (16 more...)

arXiv.org Machine Learning

1406.0071

Country: Europe (0.67)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Add feedback

The Infinite Degree Corrected Stochastic Block Model

Herlau, Tue, Schmidt, Mikkel N., Mørup, Morten

arXiv.org Machine LearningMay-30-2014

In Stochastic blockmodels, which are among the most prominent statistical models for cluster analysis of complex networks, clusters are defined as groups of nodes with statistically similar link probabilities within and between groups. A recent extension by Karrer and Newman incorporates a node degree correction to model degree heterogeneity within each group. Although this demonstrably leads to better performance on several networks it is not obvious whether modelling node degree is always appropriate or necessary. We formulate the degree corrected stochastic blockmodel as a non-parametric Bayesian model, incorporating a parameter to control the amount of degree correction which can then be inferred from data. Additionally, our formulation yields principled ways of inferring the number of groups as well as predicting missing links in the network which can be used to quantify the model's predictive performance. On synthetic data we demonstrate that including the degree correction yields better performance both on recovering the true group structure and predicting missing links when degree heterogeneity is present, whereas performance is on par for data with no degree heterogeneity within clusters. On seven real networks (with no ground truth group structure available) we show that predictive performance is about equal whether or not degree correction is included; however, for some networks significantly fewer clusters are discovered when correcting for degree indicating that the data can be more compactly explained by clusters of heterogenous degree nodes.

artificial intelligence, degree heterogeneity, machine learning, (15 more...)

arXiv.org Machine Learning

doi: 10.1103/PhysRevE.90.032819

1311.252

Country: North America > United States (0.14)

Genre: Research Report (0.40)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Add feedback

Efficient State-Space Inference of Periodic Latent Force Models

Reece, Steven, Roberts, Stephen, Ghosh, Siddhartha, Rogers, Alex, Jennings, Nicholas

arXiv.org Machine LearningMay-29-2014

Latent force models (LFM) are principled approaches to incorporating solutions to differential equations within non-parametric inference methods. Unfortunately, the development and application of LFMs can be inhibited by their computational cost, especially when closed-form solutions for the LFM are unavailable, as is the case in many real world problems where these latent forces exhibit periodic behaviour. Given this, we develop a new sparse representation of LFMs which considerably improves their computational efficiency, as well as broadening their applicability, in a principled way, to domains with periodic or near periodic latent forces. Our approach uses a linear basis model to approximate one generative model for each periodic force. We assume that the latent forces are generated from Gaussian process priors and develop a linear basis model which fully expresses these priors. We apply our approach to model the thermal dynamics of domestic buildings and show that it is effective at predicting day-ahead temperatures within the homes. We also apply our approach within queueing theory in which quasi-periodic arrival rates are modelled as latent forces. In both cases, we demonstrate that our approach can be implemented efficiently using state-space methods which encode the linear dynamic systems via LFMs. Further, we show that state estimates obtained using periodic latent force models can reduce the root mean squared error to 17% of that from non-periodic models and 27% of the nearest rival approach which is the resonator model.

artificial intelligence, bayesian inference, equation, (18 more...)

arXiv.org Machine Learning

1310.6319

Country:

Asia > Middle East > Saudi Arabia (0.14)
Europe > United Kingdom > Scotland (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)

Genre: Research Report (1.00)

Industry: Energy > Oil & Gas (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.92)

Add feedback

Functional Gaussian processes for regression with linear PDE models

Nguyen, Ngoc-Cuong, Peraire, Jaime

arXiv.org Machine LearningMay-29-2014

In this paper, we present a new statistical approach to the problem of incorporating experimental observations into a mathematical model described by linear partial differential equations (PDEs) to improve the prediction of the state of a physical system. We augment the linear PDE with a functional that accounts for the uncertainty in the mathematical model and is modeled as a {\em Gaussian process}. This gives rise to a stochastic PDE which is characterized by the Gaussian functional. We develop a {\em functional Gaussian process regression} method to determine the posterior mean and covariance of the Gaussian functional, thereby solving the stochastic PDE to obtain the posterior distribution for our prediction of the physical state. Our method has the following features which distinguish itself from other regression methods. First, it incorporates both the mathematical model and the observations into the regression procedure. Second, it can handle the observations given in the form of linear functionals of the field variable. Third, the method is non-parametric in the sense that it provides a systematic way to optimally determine the prior covariance operator of the Gaussian functional based on the observations. Fourth, it provides the posterior distribution quantifying the magnitude of uncertainty in our prediction of the physical state. We present numerical results to illustrate these features of the method and compare its performance to that of the standard Gaussian process regression.

artificial intelligence, machine learning, regression, (17 more...)

arXiv.org Machine Learning

1405.7569

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.55)

Add feedback

Supervised Dictionary Learning by a Variational Bayesian Group Sparse Nonnegative Matrix Factorization

Ivek, Ivan

arXiv.org Machine LearningMay-27-2014

INCE the appearance of the seminal paper [1], NMF has become a popular data decomposition technique due to succesful applications in a still growing number of fields where data are nonnegative, such as pixel intensities in computer vision, amplitude spectra in audio signal analysis and EEG signal analysis, term counts in document clustering problems, and item ratings in collaborative filtering. NMF aims at decompositions, where, and are all nonnegative matrices. Throughout this paper will be regarded as a collection of data samples organized columnwise, as a dictionary of features organized columnwise, and as matrix of coefficients when is projected onto the dictionary. Under assumptions of linearity and nonnegativity, when underlying dimensionality is lower than dimensionality of the original space of the data, dimensionality reduction of the data can effectively be achieved this way. Although the decomposition is nonunique in general, NMF is able to produce strictly additive decompositions perceived as part-based by adding additional bias in the model [1], [2]. To this end, different sparsity promoting regularizers have been proposed for divergence-based NMF [3]. Also, to include higher order data descriptions, many other variants have been developed, e.g.

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Machine Learning

1405.6914

Country: Europe > Croatia (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Add feedback

Bayesian Inference for Gaussian Process Classifiers with Annealing and Pseudo-Marginal MCMC

Filippone, Maurizio

arXiv.org Machine LearningMay-26-2014

Kernel methods have revolutionized the fields of pattern recognition and machine learning. Their success, however, critically depends on the choice of kernel parameters. Using Gaussian process (GP) classification as a working example, this paper focuses on Bayesian inference of covariance (kernel) parameters using Markov chain Monte Carlo (MCMC) methods. The motivation is that, compared to standard optimization of kernel parameters, they have been systematically demonstrated to be superior in quantifying uncertainty in predictions. Recently, the Pseudo-Marginal MCMC approach has been proposed as a practical inference tool for GP models. In particular, it amounts in replacing the analytically intractable marginal likelihood by an unbiased estimate obtainable by approximate methods and importance sampling. After discussing the potential drawbacks in employing importance sampling, this paper proposes the application of annealed importance sampling. The results empirically demonstrate that compared to importance sampling, annealed importance sampling can reduce the variance of the estimate of the marginal likelihood exponentially in the number of data at a computational cost that scales only polynomially. The results on real data demonstrate that employing annealed importance sampling in the Pseudo-Marginal MCMC approach represents a step forward in the development of fully automated exact inference engines for GP models.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

1311.732

Country: North America > United States (0.46)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)

Add feedback

Scalable Recommendation with Poisson Factorization

Gopalan, Prem, Hofman, Jake M., Blei, David M.

arXiv.org Artificial IntelligenceMay-20-2014

We develop a Bayesian Poisson matrix factorization model for forming recommendations from sparse user behavior data. These data are large user/item matrices where each user has provided feedback on only a small subset of items, either explicitly (e.g., through star ratings) or implicitly (e.g., through views or purchases). In contrast to traditional matrix factorization approaches, Poisson factorization implicitly models each user's limited attention to consume items. Moreover, because of the mathematical form of the Poisson likelihood, the model needs only to explicitly consider the observed entries in the matrix, leading to both scalable computation and good predictive performance. We develop a variational inference algorithm for approximate posterior inference that scales up to massive data sets. This is an efficient algorithm that iterates over the observed entries and adjusts an approximate posterior over the user/item representations. We apply our method to large real-world user data containing users rating movies, users listening to songs, and users reading scientific papers. In all these settings, Bayesian Poisson factorization outperforms state-of-the-art matrix factorization methods.

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Artificial Intelligence

1311.1704

Country: North America > United States (0.28)

Genre: Research Report (0.40)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)
Energy > Renewable > Biofuel (0.68)
Materials > Chemicals > Commodity Chemicals (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Add feedback

Modelling Data Dispersion Degree in Automatic Robust Estimation for Multivariate Gaussian Mixture Models with an Application to Noisy Speech Processing

Wu, Dalei, Wu, Haiqing

arXiv.org Machine LearningMay-19-2014

The trimming scheme with a prefixed cutoff portion is known as a method of improving the robustness of statistical models such as multivariate Gaussian mixture models (MG- MMs) in small scale tests by alleviating the impacts of outliers. However, when this method is applied to real- world data, such as noisy speech processing, it is hard to know the optimal cut-off portion to remove the outliers and sometimes removes useful data samples as well. In this paper, we propose a new method based on measuring the dispersion degree (DD) of the training data to avoid this problem, so as to realise automatic robust estimation for MGMMs. The DD model is studied by using two different measures. For each one, we theoretically prove that the DD of the data samples in a context of MGMMs approximately obeys a specific (chi or chi-square) distribution. The proposed method is evaluated on a real-world application with a moderately-sized speaker recognition task. Experiments show that the proposed method can significantly improve the robustness of the conventional training method of GMMs for speaker recognition.

artificial intelligence, machine learning, pattern recognition, (18 more...)

arXiv.org Machine Learning

1405.4599

Country:

Europe > Germany (0.46)
North America (0.46)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.32)

Add feedback

Bayesian estimation of possible causal direction in the presence of latent confounders using a linear non-Gaussian acyclic structural equation model with individual-specific effects

Shimizu, Shohei, Bollen, Kenneth

arXiv.org Machine LearningMay-19-2014

We consider learning the possible causal direction of two observed variables in the presence of latent confounding variables. Several existing methods have been shown to consistently estimate causal direction assuming linear or some type of nonlinear relationship and no latent confounders. However, the estimation results could be distorted if either assumption is actually violated. In this paper, we first propose a new linear non-Gaussian acyclic structural equation model with individual-specific effects that allows latent confounders to be considered. We then propose an empirical Bayesian approach for estimating possible causal direction using the new model. We demonstrate the effectiveness of our method using artificial and real-world data.

artificial intelligence, latent confounder, machine learning, (17 more...)

arXiv.org Machine Learning

1310.6778

Country: North America > United States > North Carolina (0.28)

Genre: Research Report (0.64)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback