instrumental distribution
- Asia > Middle East > Jordan (0.04)
- North America > Canada (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (2 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.52)
- Asia > Middle East > Jordan (0.04)
- North America > Canada (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (2 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.52)
Reviews: Pseudo-Extended Markov chain Monte Carlo
Update: I have read the author response and am satisfied with the commitment to elaborate on \beta and \pi and to simplify the Stan PE code with a "pseudo-extended" function. This paper presents a new MCMC sampling method called pseudo-extended MCMC that uses an instrumental distribution to projects the data into a higher-dimensional space where the modes are connected, making it easier for the sampler to mix. A default instrumental distribution based on tempering is provided. The method is compared to existing baselines showing efficacy on three benchmark datasets. The paper is well-placed within the existing literature.
Binary classification based Monte Carlo simulation
Argouarc'h, Elouan, Desbouvries, François
Acceptance-rejection (AR), Independent Metropolis Hastings (IMH) or importance sampling (IS) Monte Carlo (MC) simulation algorithms all involve computing ratios of probability density functions (pdfs). On the other hand, classifiers discriminate labeled samples produced by a mixture of two distributions and can be used for approximating the ratio of the two corresponding pdfs.This bridge between simulation and classification enables us to propose pdf-free versions of pdf-ratio-based simulation algorithms, where the ratio is replaced by a surrogate function computed via a classifier. From a probabilistic modeling perspective, our procedure involves a structured energy based model which can easily be trained and is compatible with the classical samplers.
- North America > United States > New York (0.04)
- North America > United States > New Jersey > Hudson County > Secaucus (0.04)
- North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
- (3 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
In Search of an Entity Resolution OASIS: Optimal Asymptotic Sequential Importance Sampling
Marchant, Neil G., Rubinstein, Benjamin I. P.
Entity resolution (ER) presents unique challenges for evaluation methodology. While crowdsourcing platforms acquire ground truth, sound approaches to sampling must drive labelling efforts. In ER, extreme class imbalance between matching and non-matching records can lead to enormous labelling requirements when seeking statistically consistent estimates for rigorous evaluation. This paper addresses this important challenge with the OASIS algorithm: a sampler and F-measure estimator for ER evaluation. OASIS draws samples from a (biased) instrumental distribution, chosen to ensure estimators with optimal asymptotic variance. As new labels are collected OASIS updates this instrumental distribution via a Bayesian latent variable model of the annotator oracle, to quickly focus on unlabelled items providing more information. We prove that resulting estimates of F-measure, precision, recall converge to the true population values. Thorough comparisons of sampling methods on a variety of ER datasets demonstrate significant labelling reductions of up to 83% without loss to estimate accuracy.
- Oceania > Australia > Victoria > Melbourne (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.66)
- Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.61)