AITopics | Bayesian Inference

Collaborating Authors

Bayesian Inference

Bayes' Theorem allows a program to infer the probabilities of likely causes from the probabilities of their effects, when what it is given are the probabilities of effects, given the causes.

News Overviews Instructional Materials AI-Alerts Classics

Markov Properties for Graphical Models with Cycles and Latent Variables

Forré, Patrick, Mooij, Joris M.

arXiv.org Machine LearningOct-24-2017

We investigate probabilistic graphical models that allow for both cycles and latent variables. For this we introduce directed graphs with hyperedges (HEDGes), generalizing and combining both marginalized directed acyclic graphs (mDAGs) that can model latent (dependent) variables, and directed mixed graphs (DMGs) that can model cycles. We define and analyse several different Markov properties that relate the graphical structure of a HEDG with a probability distribution on a corresponding product space over the set of nodes, for example factorization properties, structural equations properties, ordered/local/global Markov properties, and marginal versions of these. The various Markov properties for HEDGes are in general not equivalent to each other when cycles or hyperedges are present, in contrast with the simpler case of directed acyclic graphical (DAG) models (also known as Bayesian networks). We show how the Markov properties for HEDGes - and thus the corresponding graphical Markov models - are logically related to each other.

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Machine Learning

1710.08775

Country:

Europe (1.00)
North America > United States > California (0.67)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)

Add feedback

Scalable Exact Parent Sets Identification in Bayesian Networks Learning with Apache Spark

Karan, Subhadeep, Zola, Jaroslaw

arXiv.org Artificial IntelligenceOct-24-2017

In Machine Learning, the parent set identification problem is to find a set of random variables that best explain selected variable given the data and some predefined scoring function. This problem is a critical component to structure learning of Bayesian networks and Markov blankets discovery, and thus has many practical applications, ranging from fraud detection to clinical decision support. In this paper, we introduce a new distributed memory approach to the exact parent sets assignment problem. To achieve scalability, we derive theoretical bounds to constraint the search space when MDL scoring function is used, and we reorganize the underlying dynamic programming such that the computational density is increased and fine-grain synchronization is eliminated. We then design efficient realization of our approach in the Apache Spark platform. Through experimental results, we demonstrate that the method maintains strong scalability on a 500-core standalone Spark cluster, and it can be used to efficiently process data sets with 70 variables, far beyond the reach of the currently available solutions.

artificial intelligence, machine learning, maximal parent, (18 more...)

arXiv.org Artificial Intelligence

1705.0639

Genre: Research Report > New Finding (0.68)

Industry:

Information Technology (0.46)
Law Enforcement & Public Safety > Fraud (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

An Expectation Maximization Framework for Preferential Attachment Models

Roberts, Lucas, Roberts, Denisa

arXiv.org Machine LearningOct-23-2017

In this paper we develop an Expectation Maximization(EM) algorithm to estimate the parameter of a Yule-Simon distribution. The Yule-Simon distribution exhibits the "rich get richer" effect whereby an 80-20 type of rule tends to dominate. These distributions are ubiquitous in industrial settings. The EM algorithm presented provides both frequentist and Bayesian estimates of the $\lambda$ parameter. By placing the estimation method within the EM framework we are able to derive Standard errors of the resulting estimate. Additionally, we prove convergence of the Yule-Simon EM algorithm and study the rate of convergence. An explicit, closed form solution for the rate of convergence of the algorithm is given.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

1710.08511

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.93)

Add feedback

AI - The present in the making

#artificialintelligenceOct-22-2017, 22:05:18 GMT

For many people, the concept of Artificial Intelligence (AI) is a thing of the future. It is the technology that has yet to be introduced. But Professor Jon Oberlander disagrees. He was quick to point out that AI is not in the future, it is now in the making. He began by mentioning Alexa, Amazon's star product. It is an artificial intelligent personal assistant, which was made popular by Amazon Echo devices.

artificial intelligence, machine learning, natural language, (6 more...)

#artificialintelligence

Country: North America > United States (0.17)

Industry: Information Technology (0.53)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.76)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.57)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.33)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.33)

Add feedback

Sequential Matrix Completion

Marsden, Annie, Bacallado, Sergio

arXiv.org Machine LearningOct-22-2017

We propose a novel algorithm for sequential matrix completion in a recommender system setting, where the $(i,j)$th entry of the matrix corresponds to a user $i$'s rating of product $j$. The objective of the algorithm is to provide a sequential policy for user-product pair recommendation which will yield the highest possible ratings after a finite time horizon. The algorithm uses a Gamma process factor model with two posterior-focused bandit policies, Thompson Sampling and Information-Directed Sampling. While Thompson Sampling shows competitive performance in simulations, state-of-the-art performance is obtained from Information-Directed Sampling, which makes its recommendations based off a ratio between the expected reward and a measure of information gain. To our knowledge, this is the first implementation of Information Directed Sampling on large real datasets. This approach contributes to a recent line of research on bandit approaches to collaborative filtering including Kawale et al. (2015), Li et al. (2010), Bresler et al. (2014), Li et al. (2016), Deshpande & Montanari (2012), and Zhao et al. (2013). The setting of this paper, as has been noted in Kawale et al. (2015) and Zhao et al. (2013), presents significant challenges to bounding regret after finite horizons. We discuss these challenges in relation to simpler models for bandits with side information, such as linear or gaussian process bandits, and hope the experiments presented here motivate further research toward theoretical guarantees.

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Machine Learning

1710.08045

Country: North America (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Data Science > Data Mining > Big Data (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)

Add feedback

A Tight Excess Risk Bound via a Unified PAC-Bayesian-Rademacher-Shtarkov-MDL Complexity

Grünwald, Peter D., Mehta, Nishant A.

arXiv.org Machine LearningOct-20-2017

We present a novel notion of complexity that interpolates between and generalizes some classic existing complexity notions in learning theory: for estimators like empirical risk minimization (ERM) with arbitrary bounded losses, it is upper bounded in terms of data-independent Rademacher complexity; for generalized Bayesian estimators, it is upper bounded by the data-dependent information complexity (also known as stochastic or PAC-Bayesian, $\mathrm{KL}(\text{posterior} \operatorname{\|} \text{prior})$ complexity. For (penalized) ERM, the new complexity reduces to (generalized) normalized maximum likelihood (NML) complexity, i.e. a minimax log-loss individual-sequence regret. Our first main result bounds excess risk in terms of the new complexity. Our second main result links the new complexity via Rademacher complexity to $L_2(P)$ entropy, thereby generalizing earlier results of Opper, Haussler, Lugosi, and Cesa-Bianchi who did the log-loss case with $L_\infty$. Together, these results recover optimal bounds for VC- and large (polynomial entropy) classes, replacing localized Rademacher complexity by a simpler analysis which almost completely separates the two aspects that determine the achievable rates: 'easiness' (Bernstein) conditions and model complexity.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

1710.07732

Country:

North America > United States (0.28)
Europe (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)

Add feedback

On the Consistency of Graph-based Bayesian Learning and the Scalability of Sampling Algorithms

Trillos, Nicolas Garcia, Kaplan, Zachary, Samakhoana, Thabo, Sanz-Alonso, Daniel

arXiv.org Machine LearningOct-20-2017

A popular approach to semi-supervised learning proceeds by endowing the input data with a graph structure in order to extract geometric information and incorporate it into a Bayesian framework. We introduce new theory that gives appropriate scalings of graph parameters that provably lead to a well-defined limiting posterior as the size of the unlabeled data set grows. Furthermore, we show that these consistency results have profound algorithmic implications. When consistency holds, carefully designed graph-based Markov chain Monte Carlo algorithms are proved to have a uniform spectral gap, independent of the number of unlabeled inputs. Several numerical experiments corroborate both the statistical consistency and the algorithmic scalability established by the theory.

artificial intelligence, machine learning, posterior, (16 more...)

arXiv.org Machine Learning

1710.07702

Country: North America > United States (0.28)

Genre:

Research Report (0.81)
Instructional Material (0.67)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Finite-dimensional Gaussian approximation with linear inequality constraints

López-Lopera, Andrés F., Bachoc, François, Durrande, Nicolas, Roustant, Olivier

arXiv.org Machine LearningOct-20-2017

Introducing inequality constraints in Gaussian process (GP) models can lead to more realistic uncertainties in learning a great variety of real-world problems. We consider the finite-dimensional Gaussian approach from Maatouk and Bay (2017) which can satisfy inequality conditions everywhere (either boundedness, monotonicity or convexity). Our contributions are threefold. First, we extend their approach in order to deal with general sets of linear inequalities. Second, we explore several Markov Chain Monte Carlo (MCMC) techniques to approximate the posterior distribution. Third, we investigate theoretical and numerical properties of the constrained likelihood for covariance parameter estimation. According to experiments on both artificial and real data, our full framework together with a Hamiltonian Monte Carlo-based sampler provides efficient results on both data fitting and uncertainty quantification.

artificial intelligence, constraint, machine learning, (18 more...)

arXiv.org Machine Learning

1710.07453

Country: North America > United States (1.00)

Genre: Research Report (0.40)

Industry: Energy (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Getting Started with Particle Metropolis-Hastings for Inference in Nonlinear Dynamical Models

Dahlin, Johan, Schön, Thomas B.

arXiv.org Machine LearningOct-19-2017

This tutorial provides a gentle introduction to the particle Metropolis-Hastings (PMH) algorithm for parameter inference in nonlinear state-space models together with a software implementation in the statistical programming language R. We employ a step-by-step approach to develop an implementation of the PMH algorithm (and the particle filter within) together with the reader. This final implementation is also available as the package pmhtutorial in the CRAN repository. Throughout the tutorial, we provide some intuition as to how the algorithm operates and discuss some solutions to problems that might occur in practice. To illustrate the use of PMH, we consider parameter inference in a linear Gaussian state-space model with synthetic data and a nonlinear stochastic volatility model with real-world data.

machine learning, particle filter, programming language, (17 more...)

arXiv.org Machine Learning

1511.01707

Country:

Europe > Sweden (0.46)
North America > United States (0.28)
Europe > United Kingdom > England (0.28)
Europe > Austria (0.28)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.92)
Information Technology > Software > Programming Languages (0.88)
(2 more...)

Add feedback

Probabilistic Integration: A Role in Statistical Computation?

Briol, François-Xavier, Oates, Chris. J., Girolami, Mark, Osborne, Michael A., Sejdinovic, Dino

arXiv.org Machine LearningOct-18-2017

A research frontier has emerged in scientific computation, wherein numerical error is regarded as a source of epistemic uncertainty that can be modelled. This raises several statistical challenges, including the design of statistical methods that enable the coherent propagation of probabilities through a (possibly deterministic) computational work-flow. This paper examines the case for probabilistic numerical methods in routine statistical computation. Our focus is on numerical integration, where a probabilistic integrator is equipped with a full distribution over its output that reflects the presence of an unknown numerical error. Our main technical contribution is to establish, for the first time, rates of posterior contraction for these methods. These show that probabilistic integrators can in principle enjoy the "best of both worlds", leveraging the sampling efficiency of Monte Carlo methods whilst providing a principled route to assess the impact of numerical error on scientific conclusions. Several substantial applications are provided for illustration and critical evaluation, including examples from statistical modelling, computer graphics and a computer model for an oil reservoir.

bayesian inference, integration, upstream oil & gas, (21 more...)

arXiv.org Machine Learning

1512.00933

Country:

North America > United States (0.67)
Europe > United Kingdom > North Sea (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report > New Finding (0.93)

Industry: Energy > Oil & Gas > Upstream (1.00)

Technology:

Information Technology > Mathematics of Computing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Add feedback