AITopics | Uncertainty

Collaborating Authors

Uncertainty

"AI systems–like people–must often act despite partial and uncertain information. First, the information received may be unreliable (e.g., a patient may mis-remember when a disease started, or may not have noticed a symptom that is important to a diagnosis). In addition, rules connecting real-world events can never include all the factors that might determine whether their conclusions really apply (e.g., the correctness of basing a diagnosis on a lab test depends whether there were conditions that might have caused a false positive, on the test being done correctly, on the results being associated with the right patient, etc.) Thus in order to draw useful conclusions, AI systems must be able to reason about the probability of events, given their current knowledge."
– from David Leake, Reasoning Under Uncertainty

News Overviews Instructional Materials AI-Alerts Classics

Gibbs Sampling Using Edward

#artificialintelligenceMay-31-2018, 01:50:59 GMT

Gibbs sampling is a MCMC method to draw samples from a complex distribution (usually a posterior in Bayesian inference). In this post I aim to show how to do Gibbs sampling using Edward, "a Python library for probabilistic modeling". If you are new to Edward, you can install the package by following up these steps. In above code x0 and x1 are two place holders for samples of X0X 0X0 and X1X 1X1 from previous iteration. Edward helped us to write Gibbs sampling with less than 10 line of codes.

bayesian inference, gibbs sampling, machine learning, (1 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Approximate Knowledge Compilation by Online Collapsed Importance Sampling

Friedman, Tal, Broeck, Guy Van den

arXiv.org Artificial IntelligenceMay-31-2018

We introduce collapsed compilation, a novel approximate inference algorithm for discrete probabilistic graphical models. It is a collapsed sampling algorithm that incrementally selects which variable to sample next based on the partial sample obtained so far. This online collapsing, together with knowledge compilation inference on the remaining variables, naturally exploits local structure and context- specific independence in the distribution. These properties are naturally exploited in exact inference, but are difficult to harness for approximate inference. More- over, by having a partially compiled circuit available during sampling, collapsed compilation has access to a highly effective proposal distribution for importance sampling. Our experimental evaluation shows that collapsed compilation performs well on standard benchmarks. In particular, when the amount of exact inference is equally limited, collapsed compilation is competitive with the state of the art, and outperforms it on several benchmarks.

artificial intelligence, compilation, machine learning, (17 more...)

arXiv.org Artificial Intelligence

1805.12565

Country: North America > United States > California > Los Angeles County > Los Angeles (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Fitting a deeply-nested hierarchical model to a large book review dataset using a moment-based estimator

Zhang, Ningshan, Schmaus, Kyle, Perry, Patrick O.

arXiv.org Machine LearningMay-31-2018

We consider a particular instance of a common problem in recommender systems: using a database of book reviews to inform user-targeted recommendations. In our dataset, books are categorized into genres and sub-genres. To exploit this nested taxonomy, we use a hierarchical model that enables information pooling across across similar items at many levels within the genre hierarchy. The main challenge in deploying this model is computational: the data sizes are large, and fitting the model at scale using off-the-shelf maximum likelihood procedures is prohibitive. To get around this computational bottleneck, we extend a moment-based fitting procedure proposed for fitting single-level hierarchical models to the general case of arbitrarily deep hierarchies. This extension is an order of magnetite faster than standard maximum likelihood procedures. The fitting method can be deployed beyond recommender systems to general contexts with deeply-nested hierarchical generalized linear mixed models.

artificial intelligence, hierarchy, machine learning, (20 more...)

arXiv.org Machine Learning

1806.02321

Country:

Europe (1.00)
North America > United States > New York (0.28)

Genre: Research Report (0.83)

Industry:

Health & Medicine (1.00)
Media (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)

Add feedback

Reparameterization Gradient for Non-differentiable Models

Lee, Wonyeol, Yu, Hangyeol, Yang, Hongseok

arXiv.org Machine LearningMay-31-2018

We present a new algorithm for stochastic variational inference that targets at models with non-differentiable densities. One of the key challenges in stochastic variational inference is to come up with a low-variance estimator of the gradient of a variational objective. We tackle the challenge by generalizing the reparameterization trick, one of the most effective techniques for addressing the variance issue for differentiable models, so that the trick works for non-differentiable models as well. Our algorithm splits the space of latent variables into regions where the density of the variables is differentiable, and their boundaries where the density may fail to be differentiable. For each differentiable region, the algorithm applies the standard reparameterization trick and estimates the gradient restricted to the region. For each potentially non-differentiable boundary, it uses a form of manifold sampling and computes the direction for variational parameters that, if followed, would increase the boundary's contribution to the variational objective. The sum of all the estimates becomes the gradient estimate of our algorithm. Our estimator enjoys the reduced variance of the reparameterization gradient while remaining unbiased even for non-differentiable models. The experiments with our preliminary implementation confirm the benefit of reduced variance and unbiasedness.

artificial intelligence, international conference, machine learning, (16 more...)

arXiv.org Machine Learning

1806.00176

Country: Asia > South Korea (0.14)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)

Add feedback

Neural Control Variates for Variance Reduction

Zhu, Zhanxing, Wan, Ruosi, Zhong, Mingjun

arXiv.org Machine LearningMay-31-2018

In statistics and machine learning, approximation of an intractable integration is often achieved by using the unbiased Monte Carlo estimator, but the variances of the estimation are generally high in many applications. Control variates approaches are well-known to reduce the variance of the estimation. These control variates are typically constructed by employing predefined parametric functions or polynomials, determined by using those samples drawn from the relevant distributions. Instead, we propose to construct those control variates by learning neural networks to handle the cases when test functions are complex. In many applications, obtaining a large number of samples for Monte Carlo estimation is expensive, which may result in overfitting when training a neural network. We thus further propose to employ auxiliary random variables induced by the original ones to extend data samples for training the neural networks. We apply the proposed control variates with augmented variables to thermodynamic integration and reinforcement learning. Experimental results demonstrate that our method can achieve significant variance reduction compared with other alternatives.

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Machine Learning

1806.00159

Country: Asia > China (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Asymptotic performance of regularized multi-task learning

Chen, Shaohan, Gao, Chuanhou

arXiv.org Machine LearningMay-31-2018

This paper analyzes asymptotic performance of a regularized multi-task learning model where task parameters are optimized jointly. If tasks are closely related, empirical work suggests multi-task learning models to outperform single-task ones in finite sample cases. As data size grows indefinitely, we show the learned multi-classifier to optimize an average misclassification error function which depicts the risk of applying multi-task learning algorithm to making decisions. This technique conclusion demonstrates the regularized multi-task learning model to be able to produce reliable decision rule for each task in the sense that it will asymptotically converge to the corresponding Bayes rule. Also, we find the interaction effect between tasks vanishes as data size growing indefinitely, which is quite different from the behavior in finite sample cases.

artificial intelligence, machine learning, misclassification error, (16 more...)

arXiv.org Machine Learning

1805.12507

Genre: Research Report > New Finding (0.35)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.47)

Add feedback

Agents and Devices: A Relative Definition of Agency

Orseau, Laurent, McGill, Simon McGregor, Legg, Shane

arXiv.org Machine LearningMay-31-2018

According to Dennett, the same system may be described using a `physical' (mechanical) explanatory stance, or using an `intentional' (belief- and goal-based) explanatory stance. Humans tend to find the physical stance more helpful for certain systems, such as planets orbiting a star, and the intentional stance for others, such as living animals. We define a formal counterpart of physical and intentional stances within computational theory: a description of a system as either a device, or an agent, with the key difference being that `devices' are directly described in terms of an input-output mapping, while `agents' are described in terms of the function they optimise. Bayes' rule can then be applied to calculate the subjective probability of a system being a device or an agent, based only on its behaviour. We illustrate this using the trajectories of an object in a toy grid-world domain.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

1805.12387

Country: Europe (0.46)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Decision-Theoretic Meta-Learning: Versatile and Efficient Amortization of Few-Shot Learning

Gordon, Jonathan, Bronskill, John, Bauer, Matthias, Nowozin, Sebastian, Turner, Richard E.

arXiv.org Machine LearningMay-31-2018

This paper develops a general framework for data efficient and versatile deep learning. The new framework comprises three elements: 1) Discriminative probabilistic models from multi-task learning that leverage shared statistical information across tasks. 2) A novel Bayesian decision theoretic approach to meta-learning probabilistic inference across many tasks. 3) A fast, flexible, and simple to train amortization network that can automatically generalize and extrapolate to a wide range of settings. The VERSA algorithm, a particular instance of the framework, is evaluated on a suite of supervised few-shot learning tasks. VERSA achieves state-of-the-art performance in one-shot learning on Omniglot and miniImagenet, and produces compelling results on a one-shot ShapeNet view reconstruction challenge.

artificial intelligence, machine learning, versatile and efficient amortization, (2 more...)

arXiv.org Machine Learning

1805.09921

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.53)

Add feedback

Too Fast Causal Inference under Causal Insufficiency

Kłopotek, Mieczysław A.

arXiv.org Artificial IntelligenceMay-30-2018

Causally insufficient structures (models with latent or hidden variables, or with confounding etc.) of joint probability distributions have been subject of intense study not only in statistics, but also in various AI systems. In AI, belief networks, being representations of joint probability distribution with an underlying directed acyclic graph structure, are paid special attention due to the fact that efficient reasoning (uncertainty propagation) methods have been developed for belief network structures. Algorithms have been therefore developed to acquire the belief network structure from data. As artifacts due to variable hiding negatively influence the performance of derived belief networks, models with latent variables have been studied and several algorithms for learning belief network structure under causal insufficiency have also been developed. Regrettably, some of them are known already to be erroneous (e.g. IC algorithm of [Pearl:Verma:91]. This paper is devoted to another algorithm, the Fast Causal Inference (FCI) Algorithm of [Spirtes:93]. It is proven by a specially constructed example that this algorithm, as it stands in [Spirtes:93], is also erroneous. Fundamental reason for failure of this algorithm is the temporary introduction of non-real links between nodes of the network with the intention of later removal. While for trivial dependency structures these non-real links may be actually removed, this may not be the case for complex ones, e.g. for the case described in this paper. A remedy of this failure is proposed.

artificial intelligence, edge removal, machine learning, (13 more...)

arXiv.org Artificial Intelligence

1806.00352

Country: North America > United States (0.46)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Bayesian Pose Graph Optimization via Bingham Distributions and Tempered Geodesic MCMC

Birdal, Tolga, Şimşekli, Umut, Eken, M. Onur, Ilic, Slobodan

arXiv.org Artificial IntelligenceMay-30-2018

The ability to navigate autonomously is now a key technology in self driving cars, unmanned aerial vehicles (UAV), robot guidance, augmented reality, 3D digitization, sensory network localization and more. This ubiquitous appliance is due to the fact that vision sensors can provide cues to directly solve 6DoF pose estimation problem and does not necessitate external tracking input, such as imprecise GPS, to ego-localize. Many of the problems in these domains can now be addressed by tailor-made pipelines such as SLAM (Simultaneous Localization and Mapping), SfM (Structure From Motion) or multi robot localization (MRL) [KPZK17, CC18]. Nowadays, thanks to the resulting reliable estimates of rotations and translations, many of these pipelines rely on some form of an optimization, such as bundle adjustment (BA) [TMHF99] or 3D global registration [BI17, HH03], that can globally consider the acquired measurements.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Artificial Intelligence

1805.12279

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Robotics & Automation (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Add feedback