AITopics | Uncertainty

Uncertainty

"AI systems–like people–must often act despite partial and uncertain information. First, the information received may be unreliable (e.g., a patient may mis-remember when a disease started, or may not have noticed a symptom that is important to a diagnosis). In addition, rules connecting real-world events can never include all the factors that might determine whether their conclusions really apply (e.g., the correctness of basing a diagnosis on a lab test depends whether there were conditions that might have caused a false positive, on the test being done correctly, on the results being associated with the right patient, etc.) Thus in order to draw useful conclusions, AI systems must be able to reason about the probability of events, given their current knowledge."
– from David Leake, Reasoning Under Uncertainty

Extrapolating Expected Accuracies for Large Multi-Class Problems

Zheng, Charles, Achanta, Rakesh, Benjamini, Yuval

arXiv.org Machine Learning, Dec-27-2017

Many machine learning tasks involve recognizing or identifying an individual instance within a large set of possible candidates. These problems are usually modeled as multi-class classification problems with a large and possibly complex label set. Leading examples include detecting the speaker from his voice patterns (Togneri and Pullella, 2011), identifying the author from her written text (Stamatatos et al., 2014), or labeling the object category from its image (Duygulu et al., 2002, Deng et al., 2010, Oquab et al., 2014). In all these examples, the algorithm observes an input x and uses the classifier function h to guess the label y from a large label set S. There are multiple practical challenges in developing classifiers for large label sets. Collecting high-quality training data is perhaps the main obstacle, as the costs scale with the number of classes. It can be affordable to first collect data for a small set of classes, even if the long-term goal is to generalize to a larger set. Furthermore, classifier development can be accelerated by training first on fewer classes, as each training cycle may require substantially fewer resources. Indeed, due to interest in how small-set performance generalizes to larger sets, such comparisons can be found in the literature (Oquab et al., 2014, Griffin et al., 2007). A natural question is: how does changing the size of the label set affect the classification accuracy?

accuracy, artificial intelligence, machine learning, (19 more...)

arXiv.org Machine Learning

1712.09713

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
(2 more...)

On Connecting Stochastic Gradient MCMC and Differential Privacy

Li, Bai, Chen, Changyou, Liu, Hao, Carin, Lawrence

arXiv.org Machine Learning, Dec-25-2017

Significant success has recently been achieved in applying machine learning to real-world applications. There have also been corresponding concerns about the privacy of training data, which relates to data security and confidentiality issues. Differential privacy provides a principled and rigorous privacy guarantee on machine learning models. While it is common to design a model satisfying a required differential-privacy property by injecting noise, it is generally hard to balance the trade-off between privacy and utility. We show that stochastic gradient Markov chain Monte Carlo (SG-MCMC) -- a class of scalable Bayesian posterior sampling algorithms proposed recently -- satisfies strong differential privacy with carefully chosen step sizes. We develop theory on the performance of the proposed differentially-private SG-MCMC method. We conduct experiments to support our analysis and show that a standard SG-MCMC sampler without any modification (under a default setting) can reach state-of-the-art performance in terms of both privacy and utility on Bayesian learning.
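
As a rough illustration of what an SG-MCMC sampler does, the sketch below runs stochastic gradient Langevin dynamics (SGLD), one well-known member of the SG-MCMC family, on a toy problem: the posterior of a Gaussian mean with known variance. The model, the function name `sgld_gaussian_mean`, the step size, and the batch size are all assumptions made for this example. The step size in particular is chosen only for numerical stability and is not calibrated to any privacy guarantee from the paper.

```python
import random

def sgld_gaussian_mean(data, n_steps=5000, batch_size=10, step_size=1e-3,
                       prior_var=10.0, noise_var=1.0, seed=0):
    """Draw SGLD samples of the mean of a Gaussian with known variance.

    Hypothetical toy model: theta ~ N(0, prior_var), x_i ~ N(theta, noise_var).
    """
    rng = random.Random(seed)
    n = len(data)
    theta = 0.0
    samples = []
    for _ in range(n_steps):
        batch = rng.sample(data, batch_size)
        # Gradient of the log prior N(0, prior_var).
        grad_prior = -theta / prior_var
        # Minibatch gradient of the log likelihood, rescaled to the full data set.
        grad_lik = (n / batch_size) * sum((x - theta) / noise_var for x in batch)
        # Langevin step: half a step along the gradient plus N(0, step_size) noise.
        noise = rng.gauss(0.0, step_size ** 0.5)
        theta += 0.5 * step_size * (grad_prior + grad_lik) + noise
        samples.append(theta)
    return samples

# Synthetic data with true mean 2.0 (an assumption for the demo).
data_rng = random.Random(1)
data = [data_rng.gauss(2.0, 1.0) for _ in range(100)]

samples = sgld_gaussian_mean(data)
kept = samples[1000:]  # discard burn-in
posterior_mean_estimate = sum(kept) / len(kept)

# Closed-form posterior mean for this conjugate model, for comparison.
exact_posterior_mean = sum(data) / (len(data) + 1.0 / 10.0)
```

The injected Gaussian noise is what makes the iterates behave like posterior samples rather than an optimizer's trajectory. It is the same kind of noise that differential-privacy mechanisms rely on, which is the connection the abstract refers to.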

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Machine Learning

1712.09097

Genre: Research Report > New Finding (0.68)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.86)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Bayesian Computational Analyses with R (Udemy)

@machinelearnbot, Dec-24-2017, 21:00:39 GMT

Bayesian Computational Analyses with R is an introductory course on the use and implementation of Bayesian modeling using R software. The Bayesian approach is an alternative to the "frequentist" approach, where one simply takes a sample of data and makes inferences about the likely parameters of the population. In contrast, the Bayesian approach combines prior beliefs about the parameters (the 'prior') with the likelihood of a sample of observed data to estimate the most likely values and distributions for the population parameters (the 'posterior'). The course is useful to anyone who wishes to learn about Bayesian concepts and is suited to both novice and intermediate Bayesian students and Bayesian practitioners. The course is both practical, with many "hands-on" examples using R scripts and software, and conceptual, as it explains the Bayesian concepts. All materials, software, R scripts, slides, exercises and solutions are included with the course materials.
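
To make the prior-to-posterior idea concrete, here is a minimal sketch of a conjugate Beta-Binomial update. It is written in Python rather than R, it is not taken from the course materials, and the function names and the numbers (a Beta(2, 2) prior, 7 successes, 3 failures) are illustrative assumptions.

```python
def beta_binomial_update(prior_a, prior_b, successes, failures):
    """Return the Beta posterior parameters after observing binomial data.

    With a Beta(a, b) prior on a success probability, the posterior is
    Beta(a + successes, b + failures).
    """
    return prior_a + successes, prior_b + failures

def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)

# Hypothetical numbers: a weak prior centred on 0.5, then 7 successes in 10 trials.
prior = (2.0, 2.0)
posterior = beta_binomial_update(*prior, successes=7, failures=3)

prior_mean = beta_mean(*prior)          # 0.5
posterior_mean = beta_mean(*posterior)  # 9/14, pulled from 0.5 toward 0.7
```

The posterior mean lies between the prior mean and the observed success rate, which is the usual way prior information and data are traded off in this model.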

artificial intelligence, bayesian computational analysis, machine learning, (11 more...)

@machinelearnbot

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (0.40)
Education > Educational Setting > Online (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

On Statistical Optimality of Variational Bayes

Pati, Debdeep, Bhattacharya, Anirban, Yang, Yun

arXiv.org Machine Learning, Dec-24-2017

Variational inference [25, 7, 40] is now a well-established tool to approximate intractable posterior distributions in hierarchical multi-layered Bayesian models. The traditional Markov chain Monte Carlo (MCMC; [17]) approach of approximating distributions with intractable normalizing constants draws (correlated) samples according to a discrete-time Markov chain whose stationary distribution is the target distribution. Despite their success and popularity, MCMC methods can be slow to converge and lack scalability in big data problems and/or problems involving very many latent variables, which has fueled the search for alternatives. In contrast to the sampling approach of MCMC, variational inference approaches the problem from an optimization viewpoint. First, a class of analytically tractable distributions, referred to as the variational family, is identified for the problem at hand. For example, in mean-field approximation, the set of parameters and latent variables is divided into blocks and the variational distribution is assumed to be independent across blocks.
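
The mean-field assumption and the optimization viewpoint described above can be written out as follows. The notation is an assumption made here and need not match the paper's: x is the observed data, \theta = (\theta_1, \dots, \theta_J) collects parameters and latent variables split into J blocks, and q is the variational distribution.

```latex
% Mean-field family: q factorizes across the J blocks.
q(\theta) = \prod_{j=1}^{J} q_j(\theta_j)

% Variational inference maximizes the evidence lower bound (ELBO) over q:
\log p(x) \;\ge\; \mathbb{E}_{q}\!\left[\log p(x, \theta)\right]
          - \mathbb{E}_{q}\!\left[\log q(\theta)\right]

% The gap in the bound is the KL divergence from q to the exact posterior:
\log p(x) - \mathrm{ELBO}(q) = \mathrm{KL}\!\left(q(\theta) \,\|\, p(\theta \mid x)\right)
```

Maximizing the bound over the factorized family is therefore equivalent to choosing the member of that family closest in KL divergence to the exact posterior.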

artificial intelligence, bayesian inference, machine learning, (19 more...)

arXiv.org Machine Learning

1712.08983

Country: North America > United States (0.46)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.86)

An Approximate Bayesian Long Short-Term Memory Algorithm for Outlier Detection

Chen, Chao, Lin, Xiao, Terejanu, Gabriel

arXiv.org Machine Learning, Dec-23-2017

Abstract: Long Short-Term Memory networks trained with gradient descent and back-propagation have achieved great success in various applications. However, point estimation of the network weights is prone to over-fitting and lacks important uncertainty information associated with the estimation. At the same time, exact Bayesian neural network methods are intractable and not applicable to real-world applications. In this study, we propose an approximate estimation of the weight uncertainty using the Ensemble Kalman Filter, which is easily scalable to a large number of weights. To assess the proposed algorithm, we apply it to outlier detection in five real-world events retrieved from the Twitter platform. I. INTRODUCTION. The recent resurgence of neural networks trained with back-propagation has established state-of-the-art results in a wide range of domains. However, backpropagation-based neural networks (NN) are associated with many disadvantages, including but not limited to the lack of uncertainty estimation, a tendency to overfit small data sets, and the tuning of many hyper-parameters.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

1712.08773

Country: North America > United States > South Carolina (0.28)

Genre: Research Report (0.84)

Industry: Leisure & Entertainment > Sports (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Truncated Variational Expectation Maximization

Lücke, Jörg

arXiv.org Machine Learning, Dec-23-2017

We derive a novel variational expectation maximization approach based on truncated variational distributions. Truncated distributions are proportional to exact posteriors within a subset of a discrete state space and equal zero otherwise. The novel variational approach is realized by first generalizing the standard variational EM framework to include variational distributions with exact ('hard') zeros. A fully variational treatment of truncated distributions then allows for deriving novel and mathematically grounded results, which in turn can be used to formulate novel efficient algorithms to optimize the parameters of probabilistic generative models. We find the free energies which correspond to truncated distributions to be given by concise and efficiently computable expressions, while update equations for model parameters (M-steps) remain in their standard form. Furthermore, we obtain generic expressions for expectation values w.r.t. truncated distributions. Based on these observations, we show how efficient and easily applicable meta-algorithms can be formulated that guarantee a monotonic increase of the free energy. Example applications of the framework derived here provide novel theoretical results and learning procedures for latent variable models as well as mixture models, including procedures to tightly couple sampling and variational optimization approaches. Furthermore, by considering a special case of truncated variational distributions, we can cleanly and fully embed the well-known 'hard EM' approaches into the variational EM framework, and we show that 'hard EM' (for models with discrete latents) provably optimizes a lower free energy bound of the data log-likelihood.
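
The definition of a truncated distribution given in the second sentence can be written as a formula. The symbols are assumptions chosen for illustration and may differ from the paper's notation: s is a discrete latent state, y a data point, \Theta the model parameters, and K the subset of the state space on which the distribution is non-zero.

```latex
% Truncated variational distribution: the exact posterior renormalized on K,
% and exactly zero outside K.
q(s; K, \Theta) =
  \frac{p(s \mid y, \Theta)}{\sum_{s' \in K} p(s' \mid y, \Theta)}
  \,\mathbb{1}(s \in K)

% Special case: if K contains only one state s^*, then q puts all its mass
% on s^*, which is the kind of 'hard' assignment the abstract connects to 'hard EM'.
q(s; \{s^*\}, \Theta) = \mathbb{1}(s = s^*)
```

Because the posterior's normalizing constant cancels in the ratio, q can be evaluated from joint probabilities p(s, y | \Theta) over the states in K alone.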

free energy, machine learning, natural language, (21 more...)

arXiv.org Machine Learning

1610.03113

Country: Europe (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Adaptive Stochastic Dual Coordinate Ascent for Conditional Random Fields

Priol, Rémi Le, Touati, Ahmed, Lacoste-Julien, Simon

arXiv.org Machine Learning, Dec-22-2017

This work investigates training Conditional Random Fields (CRF) by Stochastic Dual Coordinate Ascent (SDCA). SDCA enjoys a linear convergence rate and a strong empirical performance for independent classification problems. However, it has never been used to train CRF. Yet it benefits from an exact line search with a single marginalization oracle call, unlike previous approaches. In this paper, we adapt SDCA to train CRF and we enhance it with an adaptive non-uniform sampling strategy. Our preliminary experiments suggest that this method matches state-of-the-art CRF optimization techniques.

artificial intelligence, machine learning, neural information processing system, (14 more...)

arXiv.org Machine Learning

1712.08577

Genre: Research Report > Experimental Study (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.47)

Boosted Generative Models

Grover, Aditya, Ermon, Stefano

arXiv.org Artificial Intelligence, Dec-22-2017

We propose a novel approach for using unsupervised boosting to create an ensemble of generative models, where models are trained in sequence to correct earlier mistakes. Our meta-algorithmic framework can leverage any existing base learner that permits likelihood evaluation, including recent deep expressive models. Further, our approach allows the ensemble to include discriminative models trained to distinguish real data from model-generated data. We show theoretical conditions under which incorporating a new model in the ensemble will improve the fit and empirically demonstrate the effectiveness of our black-box boosting algorithms on density estimation, classification, and sample generation on benchmark datasets for a wide range of generative models.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

1702.08484

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.89)
(3 more...)

Model selection for Gaussian processes utilizing sensitivity of posterior predictive distribution

Paananen, Topi, Piironen, Juho, Andersen, Michael Riis, Vehtari, Aki

arXiv.org Machine Learning, Dec-21-2017

We propose two novel methods for simplifying Gaussian process (GP) models by examining the predictions of a full model in the vicinity of the training points and thereby ordering the covariates based on their predictive relevance. Our results on synthetic and real world data sets demonstrate improved variable selection compared to automatic relevance determination (ARD) in terms of consistency and predictive performance. We expect our proposed methods to be useful in interpreting and understanding complex Gaussian process models.

artificial intelligence, covariate, machine learning, (19 more...)

arXiv.org Machine Learning

1712.08048

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.83)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Inverse Ising problem in continuous time: A latent variable approach

Donner, Christian, Opper, Manfred

arXiv.org Machine Learning, Dec-21-2017

In recent years, the inverse Ising problem, i.e. the reconstruction of couplings and external fields of an Ising model from samples of spin configurations, has attracted considerable interest in the physics community [1]. This is because Ising models play an important role in data modeling, with applications to neural spike data [2, 3], protein structure determination [4], and gene expression analysis [5]. Much effort has been devoted to the development of algorithms for the static inverse Ising problem. This is a nontrivial task, because statistically efficient, likelihood-based methods become computationally infeasible due to the intractability of the partition function of the model. Hence one has to resort either to approximate inference methods or to other statistical estimators such as pseudo-likelihood methods [6] or the interaction screening algorithm [7]. The situation is somewhat simpler for the dynamical inverse Ising problem, which recently attracted attention [8-13]. If one assumes Markovian dynamics, the exact normalisation of the spin transition probabilities allows for an explicit computation of the likelihood if one has a complete set of observed data over time. Nevertheless, the model parameters enter the likelihood in a fairly complex way, and the application of more advanced statistical approaches such as Bayesian inference again becomes a nontrivial task. This is especially true for the continuous time kinetic Ising model where the spins are governed by Glauber dynamics [14].
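
The remark about the intractable partition function can be illustrated with a brute-force computation. The sketch below assumes the common static parameterization P(s) proportional to exp(sum_i h_i s_i + sum_{i<j} J_ij s_i s_j) with spins in {-1, +1}. The function name `ising_log_partition` and the example parameters are hypothetical and not taken from the paper.

```python
import itertools
import math

def ising_log_partition(couplings, fields):
    """Return log Z for a small Ising model by summing over all 2**n states.

    couplings[i][j] (used for i < j) is the coupling J_ij and fields[i] is
    the external field h_i. Only feasible for small n because the loop
    visits 2**n spin configurations.
    """
    n = len(fields)
    total = 0.0
    for spins in itertools.product((-1, 1), repeat=n):
        # Unnormalized log-probability of this spin configuration.
        log_weight = sum(fields[i] * spins[i] for i in range(n))
        log_weight += sum(couplings[i][j] * spins[i] * spins[j]
                          for i in range(n) for j in range(i + 1, n))
        total += math.exp(log_weight)
    return math.log(total)

# Three uncoupled spins with no field: Z = 2**3, so log Z = log 8.
zero_couplings = [[0.0] * 3 for _ in range(3)]
log_z_free = ising_log_partition(zero_couplings, [0.0, 0.0, 0.0])
```

Each additional spin doubles the number of terms in the sum, so exact likelihood evaluation quickly becomes impractical. This is the reason the text gives for turning to approximate inference or pseudo-likelihood estimators in the static problem.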

artificial intelligence, bayesian inference, machine learning, (15 more...)

arXiv.org Machine Learning

doi: 10.1103/PhysRevE.96.062104

1709.04495

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.70)
