AITopics | Bayesian Inference

Bayesian Inference

Bayes' Theorem allows a program to infer the probabilities of likely causes from observed effects when what it is given are the probabilities of the effects given the causes, together with the prior probabilities of the causes.
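
As a minimal sketch of this inference, the following Python snippet computes P(cause | effect) from P(effect | cause) and a prior P(cause). The function name `posterior` and the diagnostic-test numbers are hypothetical, chosen only for illustration.

```python
def posterior(prior, likelihood):
    """Return P(cause | effect) for every cause via Bayes' theorem.

    prior:      dict mapping cause -> P(cause)
    likelihood: dict mapping cause -> P(effect | cause)
    """
    # Joint probability P(effect, cause) = P(effect | cause) * P(cause).
    joint = {cause: likelihood[cause] * prior[cause] for cause in prior}
    # Evidence P(effect), obtained by the law of total probability.
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("the effect has zero probability under every cause")
    return {cause: p / evidence for cause, p in joint.items()}


# Hypothetical numbers: a rare condition and an imperfect test.
prior = {"disease": 0.01, "healthy": 0.99}
likelihood = {"disease": 0.90, "healthy": 0.05}  # P(positive test | cause)

post = posterior(prior, likelihood)
print(post)  # P(disease | positive) is about 0.154 despite the 0.90 hit rate
```

The example shows why the prior matters: a positive result from a fairly accurate test still leaves the rare cause unlikely.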

Artificial Intelligence #3: kNN & Bayes Classification method

@machinelearnbot, Apr-9-2018, 05:20:30 GMT

In this course you learn the k-Nearest Neighbors and Naive Bayes classification methods. In pattern recognition, the k-nearest neighbors algorithm (k-NN) is a non-parametric method used for classification and regression. The k-NN algorithm is among the simplest of all machine learning algorithms. A useful technique is to weight the contributions of the neighbors, so that nearer neighbors contribute more to the vote or average than more distant ones. The neighbors are taken from a set of objects for which the class is known (for k-NN classification).
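
The neighbor-weighting idea can be sketched in a few lines of Python. This is an illustrative sketch, not the course's own code: the function name `weighted_knn_predict`, the inverse-distance weighting, and the toy data are assumptions.

```python
import math
from collections import defaultdict


def weighted_knn_predict(train, query, k=3, eps=1e-12):
    """Classify `query` by a distance-weighted vote of its k nearest neighbors.

    train: list of (feature_tuple, label) pairs with known labels
    query: feature tuple to classify
    eps:   small constant that avoids division by zero on exact matches
    """
    # Keep the k training points closest to the query (Euclidean distance).
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = defaultdict(float)
    for features, label in neighbors:
        # Inverse-distance weight: nearer neighbors count for more.
        votes[label] += 1.0 / (math.dist(features, query) + eps)
    return max(votes, key=votes.get)


# Toy data: one nearby "a" point and two more distant "b" points.
train = [((0.0, 0.0), "a"), ((3.0, 0.0), "b"), ((3.1, 0.0), "b")]
print(weighted_knn_predict(train, (0.5, 0.0), k=3))  # prints "a"
```

With k=3 an unweighted majority vote would return "b" here; the weighting lets the single close neighbor outvote the two distant ones.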

artificial intelligence, classifier, knn & bayes classification method, (8 more...)

@machinelearnbot

Genre: Instructional Material > Course Syllabus & Notes (0.76)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (0.40)
Education > Educational Setting > Online (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Generalized Dual Decomposition for Bounding Maximum Expected Utility of Influence Diagrams with Perfect Recall

Lee, Junkyu (University of California, Irvine) | Ihler, Alexander (University of California, Irvine) | Dechter, Rina (University of California, Irvine)

AAAI Conferences, Apr-6-2018

We introduce a generalized dual decomposition bound for computing the maximum expected utility of influence diagrams based on the dual decomposition method generalized to $L^p$ space. The main goal is to devise an approximation scheme free from translations required by existing variational approaches while exploiting the local structure of the sum of utility functions as well as the conditional independence of probability functions. In this work, the generalized dual decomposition method is applied to the algebraic framework called valuation algebra for influence diagrams, which handles probability and expected utility as a pair. The proposed approach allows a sequential decision problem to be decomposed as a collection of sub-decision problems of bounded complexity and the upper bound of maximum expected utility to be computed by combining the local expected utilities. Thus, it offers flexible control of the space and time complexity of computing the bound. In addition, the upper bounds can be further minimized by reparameterizing the utility functions. Since the global objective function for the minimization is nonconvex, we present a gradient-based local search algorithm in which the outer loop controls the randomization of the initial configurations and the inner loop tightens the upper bound using block coordinate descent with gradients perturbed by random noise. The experimental evaluation demonstrates highlights of the proposed approach on finite horizon MDP/POMDP instances.

The Kanerva Machine: A Generative Distributed Memory

Wu, Yan | Wayne, Greg | Graves, Alex | Lillicrap, Timothy

arXiv.org Machine Learning, Apr-5-2018

We present an end-to-end trained memory system that quickly adapts to new data and generates samples like them. Inspired by Kanerva's sparse distributed memory, it has a robust distributed reading and writing mechanism. The memory is analytically tractable, which enables optimal on-line compression via a Bayesian update-rule. We formulate it as a hierarchical conditional generative model, where memory provides a rich data-dependent prior distribution. Consequently, the top-down memory and bottom-up perception are combined to produce the code representing an observation. Empirically, we demonstrate that the adaptive memory significantly improves generative models trained on both the Omniglot and CIFAR datasets. Compared with the Differentiable Neural Computer (DNC) and its variants, our memory model has greater capacity and is significantly easier to train.

conference paper, generative model, kanerva machine, (15 more...)

arXiv.org Machine Learning

1804.01756

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Information Maximizing Exploration with a Latent Dynamics Model

Barron, Trevor | Obst, Oliver | Amor, Heni Ben

arXiv.org Machine Learning, Apr-4-2018

All reinforcement learning algorithms must handle the trade-off between exploration and exploitation. Many state-of-the-art deep reinforcement learning methods use noise in the action selection, such as Gaussian noise in policy gradient methods or $\epsilon$-greedy in Q-learning. While these methods are appealing due to their simplicity, they do not explore the state space in a methodical manner. We present an approach that uses a model to derive reward bonuses as a means of intrinsic motivation to improve model-free reinforcement learning. A key insight of our approach is that this dynamics model can be learned in the latent feature space of a value function, representing the dynamics of the agent and the environment. This method is both theoretically grounded and computationally advantageous, permitting the efficient use of Bayesian information-theoretic methods in high-dimensional state spaces. We evaluate our method on several continuous control tasks, focusing on improving exploration.
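
For reference, the $\epsilon$-greedy rule that the abstract names as a simple noise-based baseline can be sketched as follows. This is a generic illustration of that baseline, not the paper's method; the function name `epsilon_greedy` and the Q-values are hypothetical.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick an action index from a list of estimated action values.

    With probability `epsilon` a uniformly random action is chosen
    (exploration); otherwise the highest-valued action is chosen
    (exploitation).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


# Hypothetical Q-values for three actions in some state.
q = [0.1, 0.9, 0.3]
print(epsilon_greedy(q, epsilon=0.0))  # always greedy: prints 1
print(epsilon_greedy(q, epsilon=0.1))  # usually 1, occasionally random
```

The random branch ignores everything the agent has learned about the state space, which is the lack of methodical exploration the abstract points to.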

bayesian inference, upstream oil & gas, value function, (19 more...)

arXiv.org Machine Learning

1804.01238

Country: North America > United States (0.68)

Genre: Research Report (0.41)

Industry: Energy > Oil & Gas > Upstream (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Gaussian Process Subset Scanning for Anomalous Pattern Detection in Non-iid Data

Herlands, William | McFowland, Edward III | Wilson, Andrew Gordon | Neill, Daniel B.

arXiv.org Machine Learning, Apr-4-2018

Identifying anomalous patterns in real-world data is essential for understanding where, when, and how systems deviate from their expected dynamics. Yet methods that separately consider the anomalousness of each individual data point have low detection power for subtle, emerging irregularities. Additionally, recent detection techniques based on subset scanning make strong independence assumptions and suffer degraded performance in correlated data. We introduce methods for identifying anomalous patterns in non-iid data by combining Gaussian processes with novel log-likelihood ratio statistic and subset scanning techniques. Our approaches are powerful, interpretable, and can integrate information across multiple data streams. We illustrate their performance on numeric simulations and three open source spatiotemporal datasets of opioid overdose deaths, 311 calls, and storm reports.

artificial intelligence, data mining, machine learning, (17 more...)

arXiv.org Machine Learning

1804.01466

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > New York > Bronx County > New York City (0.14)
North America > United States > New York > Richmond County > New York City (0.04)
(5 more...)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Education (0.93)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Addiction Disorder (0.49)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
(2 more...)

An Imprecise Probabilistic Estimator for the Transition Rate Matrix of a Continuous-Time Markov Chain

Krak, Thomas | Erreygers, Alexander | De Bock, Jasper

arXiv.org Machine Learning, Apr-4-2018

We consider the problem of estimating the transition rate matrix of a continuous-time Markov chain from a finite-duration realisation of this process. We approach this problem in an imprecise probabilistic framework, using a set of prior distributions on the unknown transition rate matrix. The resulting estimator is a set of transition rate matrices that, for reasons of conjugacy, is easy to find. To determine the hyperparameters for our set of priors, we reconsider the problem in discrete time, where we can use the well-known Imprecise Dirichlet Model. In particular, we show how the limit of the resulting discrete-time estimators is a continuous-time estimator. It corresponds to a specific choice of hyperparameters and has an exceptionally simple closed-form expression.
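
To make the estimation problem concrete, here is a sketch of the classical precise maximum-likelihood estimator of a transition rate matrix from one observed path: each off-diagonal rate is the number of observed i-to-j jumps divided by the total time spent in state i. This is the standard point estimator, not the set-valued imprecise estimator the abstract proposes; the function name, the path encoding, and the example path are assumptions.

```python
def mle_rate_matrix(path, n_states, end_time):
    """Classical MLE of a CTMC transition rate matrix from one realisation.

    path:     list of (jump_time, state) pairs sorted by time, starting
              with the initial state at time 0
    n_states: number of states, labelled 0 .. n_states - 1
    end_time: time at which observation stops
    """
    counts = [[0] * n_states for _ in range(n_states)]
    holding = [0.0] * n_states
    # Accumulate holding times and jump counts between consecutive events.
    for (t0, s0), (t1, s1) in zip(path, path[1:]):
        holding[s0] += t1 - t0
        counts[s0][s1] += 1
    # Time spent in the final state until observation ends.
    t_last, s_last = path[-1]
    holding[s_last] += end_time - t_last

    rates = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        if holding[i] == 0:
            continue  # state never visited: its row is left at zero
        for j in range(n_states):
            if i != j:
                rates[i][j] = counts[i][j] / holding[i]
        # Rows of a rate matrix sum to zero.
        rates[i][i] = -sum(rates[i][j] for j in range(n_states) if j != i)
    return rates


# Hypothetical two-state path observed on the interval [0, 6].
path = [(0.0, 0), (2.0, 1), (3.0, 0), (5.0, 1)]
print(mle_rate_matrix(path, n_states=2, end_time=6.0))
# [[-0.5, 0.5], [0.5, -0.5]]
```

The skipped rows for unvisited states show where a single point estimate runs out of information.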

estimator, markov chain, matrix, (15 more...)

arXiv.org Machine Learning

1804.0133

Country:

Asia > Japan (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.63)

You Must Have Clicked on this Ad by Mistake! Data-Driven Identification of Accidental Clicks on Mobile Ads with Applications to Advertiser Cost Discounting and Click-Through Rate Prediction

Tolomei, Gabriele | Lalmas, Mounia | Farahat, Ayman | Haines, Andrew

arXiv.org Machine Learning, Apr-3-2018

In the cost per click (CPC) pricing model, an advertiser pays an ad network only when a user clicks on an ad; in turn, the ad network gives a share of that revenue to the publisher where the ad was impressed. Still, advertisers may be unsatisfied with ad networks charging them for "valueless" clicks, or so-called accidental clicks. [...] Charging advertisers for such clicks is detrimental in the long term as the advertiser may decide to run their campaigns on other ad networks. In addition, machine-learned click models trained to predict which ad will bring the highest revenue may overestimate an ad's click-through rate and, as a consequence, negatively impact revenue for both the ad network and the publisher. In this work, we propose a data-driven method to detect accidental clicks from the perspective of the ad network. We collect observations of time spent by users on a large set of ad landing pages, i.e., dwell time. We notice that the majority of per-ad distributions of dwell time fit a mixture of distributions, where each component may correspond to a particular type of click, the first one being accidental. We then estimate dwell time thresholds of accidental clicks from that component. Using our method to identify accidental clicks, we then propose a technique that smoothly discounts the advertiser's cost of accidental clicks at billing time. Experiments conducted on a large dataset of ads served on Yahoo mobile apps confirm that our thresholds are stable over time, and revenue loss in the short term is marginal. We also compare the performance of an existing machine-learned click model trained on all ad clicks with that of the same model trained only on non-accidental clicks. There, we observe an increase in both ad click-through rate (+3.9%) and revenue (+0.2%) on ads served by the Yahoo Gemini network when using the latter. [...]

artificial intelligence, information management, machine learning, (19 more...)

arXiv.org Machine Learning

1804.06912

Country: North America > United States > New York (0.15)

Genre: Research Report > Experimental Study (0.46)

Industry:

Marketing (1.00)
Information Technology > Services (1.00)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science (0.93)
(2 more...)

Large-Scale Cox Process Inference using Variational Fourier Features

John, S. T. | Hensman, James

arXiv.org Machine Learning, Apr-3-2018

Gaussian process modulated Poisson processes provide a flexible framework for modelling spatiotemporal point patterns. So far this has been restricted to one dimension, binning to a pre-determined grid, or small data sets of up to a few thousand data points. Here we introduce Cox process inference based on Fourier features. This sparse representation induces global rather than local constraints on the function space and is computationally efficient. This allows us to formulate a grid-free approximation that scales well with the number of data points and the size of the domain. We demonstrate that this allows MCMC approximations to the non-Gaussian posterior. We also find that, in practice, Fourier features have more consistent optimization behavior than previous approaches. Our approximate Bayesian method can fit over 100,000 events with complex spatiotemporal patterns in three dimensions on a single GPU.

artificial intelligence, bayesian inference, machine learning, (19 more...)

arXiv.org Machine Learning

1804.01016

Genre: Research Report (0.40)

Industry: Transportation > Passenger (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.91)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

How a Defense of Christianity Revolutionized Brain Science - Facts So Romantic

Nautilus, Apr-2-2018, 09:40:06 GMT

Presbyterian reverend Thomas Bayes had no reason to suspect he'd make any lasting contribution to humankind. Born in England at the beginning of the 18th century, Bayes was a quiet and questioning man. He published only two works in his lifetime. In 1731, he wrote a defense of God's--and the British monarchy's--"divine benevolence," and in 1736, an anonymous defense of the logic of Isaac Newton's calculus. Yet an argument he wrote before his death in 1761 would shape the course of history.

artificial intelligence, brain, machine learning, (16 more...)

Nautilus

Country:

North America > United States (0.48)
Europe > United Kingdom > England (0.35)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.51)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Collaborative targeted minimum loss inference from continuously indexed nuisance parameter estimators

Ju, Cheng | Chambaz, Antoine | van der Laan, Mark J.

arXiv.org Machine Learning, Mar-30-2018

Suppose that we wish to infer the value of a statistical parameter at a law from which we sample independent observations. Suppose that this parameter is smooth and that we can define two variation-independent, infinite-dimensional features of the law, its so-called Q- and G-components (comp.), such that if we estimate them consistently at a fast enough product of rates, then we can build a confidence interval (CI) with a given asymptotic level based on a plain targeted minimum loss estimator (TMLE). The estimators of the Q- and G-comp. would typically be by-products of machine learning algorithms. We focus on the case that the machine learning algorithm for the G-comp. is fine-tuned by a real-valued parameter h. Then, a plain TMLE with an h chosen by cross-validation would typically not lend itself to the construction of a CI, because the selection of h would trade off its empirical bias with something akin to the empirical variance of the estimator of the G-comp. as opposed to that of the TMLE. A collaborative TMLE (C-TMLE) might, however, succeed in achieving the relevant trade-off. We construct a C-TMLE and show that, under high-level empirical processes conditions, and if there exists an oracle h that makes a bulky remainder term asymptotically Gaussian, then the C-TMLE is asymptotically Gaussian, hence amenable to building a CI provided that its asymptotic variance can be estimated too. We illustrate the construction and main result with the inference of the average treatment effect, where the Q-comp. consists in a marginal law and a conditional expectation, and the G-comp. is a propensity score (a conditional probability). We also conduct a multi-faceted simulation study to investigate the empirical properties of the collaborative TMLE when the G-comp. is estimated by the LASSO. Here, h is the bound on the $\ell_1$-norm of the candidate coefficients.

artificial intelligence, estimator, machine learning, (18 more...)

arXiv.org Machine Learning

1804.00102

Country:

North America > United States (0.45)
Europe > Austria (0.27)

Genre: Research Report > Experimental Study (0.67)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
