Parallelising MCMC via Random Forests

Nov-21-2019–arXiv.org Machine Learning

Markov chain Monte Carlo (MCMC), a generic sampling method, is ubiquitous in modern statistics, especially in Bayesian inference. To sample from a target distribution, MCMC algorithms require only pointwise evaluation of the target density, up to a multiplicative constant. In Bayesian analysis, the main object of interest is the posterior, which is generally not available in closed form, and MCMC has become a standard tool in this domain. However, MCMC is difficult to scale, and its applications are limited when the number of observations is very large, because it must sweep over the entire data set to evaluate the likelihood function at each iteration. Recently, many methods have been proposed to scale MCMC algorithms to big data sets, and these can be roughly classified into two groups (Bardenet et al., 2017): divide-and-conquer methods and subsampling-based methods. In divide-and-conquer methods, one splits the whole data set into subsets, runs MCMC over each subset to generate samples of the parameters, and combines these samples to produce an approximation of the true posterior. Depending on how MCMC is handled over the subsets, these methods can be further classified into two sub-categories.
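The divide-and-conquer recipe described above (split the data, run MCMC on each subset, combine the draws) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy model (a Gaussian mean with unit noise variance and a N(0, 10^2) prior), the function names `log_subposterior`, `random_walk_metropolis`, `consensus_combine` and `divide_and_conquer`, and the step-size heuristic. In particular, the combination step is a simple consensus-style inverse-variance weighted average, swapped in for brevity; it is not the random-forest approach of the title.

```python
import math
import random


def log_subposterior(theta, subset, n_subsets, prior_sd=10.0):
    """Unnormalised log-density of one subposterior (toy Gaussian-mean model).

    The prior is raised to the power 1/n_subsets so that the product of all
    subposteriors is proportional to the full posterior.
    """
    log_prior = -0.5 * (theta / prior_sd) ** 2 / n_subsets
    log_lik = sum(-0.5 * (x - theta) ** 2 for x in subset)
    return log_prior + log_lik


def random_walk_metropolis(log_target, init, n_iter, step, rng):
    """Random-walk Metropolis; needs log_target only up to an additive constant."""
    theta, logp = init, log_target(init)
    samples = []
    for _ in range(n_iter):
        proposal = theta + rng.gauss(0.0, step)
        logp_prop = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(theta)).
        if rng.random() < math.exp(min(0.0, logp_prop - logp)):
            theta, logp = proposal, logp_prop
        samples.append(theta)
    return samples


def sample_variance(xs):
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)


def consensus_combine(chains):
    """Combine draws index by index with inverse-variance weights (illustrative)."""
    weights = [1.0 / sample_variance(chain) for chain in chains]
    total = sum(weights)
    n_draws = min(len(chain) for chain in chains)
    return [
        sum(w * chain[i] for w, chain in zip(weights, chains)) / total
        for i in range(n_draws)
    ]


def divide_and_conquer(data, n_subsets, n_iter, burn_in, seed=0):
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)
    # Split the shuffled data into n_subsets interleaved subsets.
    subsets = [shuffled[k::n_subsets] for k in range(n_subsets)]
    chains = []
    for subset in subsets:
        def log_target(theta, subset=subset):
            return log_subposterior(theta, subset, n_subsets)

        init = sum(subset) / len(subset)
        step = 2.4 / math.sqrt(len(subset))  # heuristic scale, assumes unit noise
        chain = random_walk_metropolis(log_target, init, n_iter, step, rng)
        chains.append(chain[burn_in:])  # discard burn-in draws
    return consensus_combine(chains)


# Usage: synthetic data with true mean 2.0, split across 4 subsets.
data_rng = random.Random(1)
data = [data_rng.gauss(2.0, 1.0) for _ in range(400)]
draws = divide_and_conquer(data, n_subsets=4, n_iter=2000, burn_in=500)
print(sum(draws) / len(draws), sum(data) / len(data))
```

Each chain evaluates the likelihood of its own subset only, so the per-iteration cost scales with the subset size and not with the full data set. The chains run sequentially in this sketch, but they do not communicate and could be run on separate workers.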



