AITopics | Uncertainty

Uncertainty

"AI systems–like people–must often act despite partial and uncertain information. First, the information received may be unreliable (e.g., a patient may mis-remember when a disease started, or may not have noticed a symptom that is important to a diagnosis). In addition, rules connecting real-world events can never include all the factors that might determine whether their conclusions really apply (e.g., the correctness of basing a diagnosis on a lab test depends whether there were conditions that might have caused a false positive, on the test being done correctly, on the results being associated with the right patient, etc.) Thus in order to draw useful conclusions, AI systems must be able to reason about the probability of events, given their current knowledge."
– from David Leake, Reasoning Under Uncertainty

Modelling Preference Data with the Wallenius Distribution

Grazian, Clara, Leisen, Fabrizio, Liseo, Brunero

arXiv.org Machine Learning, Feb-7-2018

The Wallenius distribution is a generalisation of the Hypergeometric distribution where weights are assigned to balls of different colours. This naturally defines a model for ranking categories which can be used for classification purposes. Since, in general, the resulting likelihood is not analytically available, we adopt an approximate Bayesian computational (ABC) approach for estimating the importance of the categories. We illustrate the performance of the estimation procedure on simulated datasets. Finally, we use the new model for analysing two datasets about movie ratings and Italian academic statisticians' journal preferences. The latter is a novel dataset collected by the authors.
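
The urn scheme behind the Wallenius distribution can be sketched as a forward simulation, which is also the kind of simulator an ABC approach relies on when the likelihood is unavailable. This is a minimal illustration and not the authors' code: the function name `sample_wallenius` and its parameters are hypothetical, and it assumes balls are drawn one at a time without replacement, each colour being picked with probability proportional to its weight times the number of its balls still in the urn.

```python
import random


def sample_wallenius(counts, weights, n_draws, rng=None):
    """Simulate one draw from a multivariate Wallenius urn (illustrative sketch).

    counts[i]  : number of balls of colour i initially in the urn
    weights[i] : positive weight (importance) of colour i
    n_draws    : number of balls taken without replacement
    Returns a list with the number of balls drawn from each colour.
    """
    if n_draws > sum(counts):
        raise ValueError("cannot draw more balls than the urn contains")
    rng = rng or random.Random()
    remaining = list(counts)
    drawn = [0] * len(counts)
    for _ in range(n_draws):
        # Selection mass of a colour = weight x balls of that colour left.
        mass = [w * m for w, m in zip(weights, remaining)]
        colour = rng.choices(range(len(counts)), weights=mass, k=1)[0]
        remaining[colour] -= 1
        drawn[colour] += 1
    return drawn
```

Calling `sample_wallenius([5, 5], [1.0, 3.0], 4)` returns the colour counts of four draws; with equal weights the scheme reduces to ordinary hypergeometric sampling.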

artificial intelligence, machine learning, wallenius distribution, (18 more...)

arXiv.org Machine Learning

1701.08142

Country: Europe > United Kingdom > England (0.28)

Genre: Research Report (0.64)

Industry:

Leisure & Entertainment (1.00)
Media > Film (0.69)
Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.69)
Information Technology > Data Science (0.68)
(2 more...)

Bayesian Recurrent Neural Network Models for Forecasting and Quantifying Uncertainty in Spatial-Temporal Data

McDermott, Patrick L., Wikle, Christopher K.

arXiv.org Machine Learning, Feb-6-2018

Recurrent neural networks (RNNs) are nonlinear dynamical models commonly used in the machine learning and dynamical systems literature to represent complex dynamical or sequential relationships between variables. More recently, as deep learning models have become more common, RNNs have been used to forecast increasingly complicated systems. Dynamical spatio-temporal processes represent a class of complex systems that can potentially benefit from these types of models. Although the RNN literature is expansive and highly developed, uncertainty quantification is often ignored. Even when considered, the uncertainty is generally quantified without the use of a rigorous framework, such as a fully Bayesian setting. Here we attempt to quantify uncertainty in a more formal framework while maintaining the forecast accuracy that makes these models appealing, by presenting a Bayesian RNN model for nonlinear spatio-temporal forecasting. Additionally, we make simple modifications to the basic RNN to help accommodate the unique nature of nonlinear spatio-temporal data. The proposed model is applied to a Lorenz simulation and two real-world nonlinear spatio-temporal forecasting applications.

artificial intelligence, bast-rnn model, machine learning, (20 more...)

arXiv.org Machine Learning

1711.00636

Country: North America > United States (1.00)

Genre: Research Report (0.50)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Banking & Finance > Economy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Bayesian Coreset Construction via Greedy Iterative Geodesic Ascent

Campbell, Trevor, Broderick, Tamara

arXiv.org Machine Learning, Feb-5-2018

Coherent uncertainty quantification is a key strength of Bayesian methods. But modern algorithms for approximate Bayesian posterior inference often sacrifice accurate posterior uncertainty estimation in the pursuit of scalability. This work shows that previous Bayesian coreset construction algorithms, which build a small, weighted subset of the data that approximates the full dataset, are no exception. We demonstrate that these algorithms scale the coreset log-likelihood suboptimally, resulting in underestimated posterior uncertainty. To address this shortcoming, we develop greedy iterative geodesic ascent (GIGA), a novel algorithm for Bayesian coreset construction that scales the coreset log-likelihood optimally. GIGA provides geometric decay in posterior approximation error as a function of coreset size, and maintains the fast running time of its predecessors. The paper concludes with validation of GIGA on both synthetic and real datasets, demonstrating that it reduces posterior approximation error by orders of magnitude compared with previous coreset constructions.

artificial intelligence, bayesian inference, machine learning, (12 more...)

arXiv.org Machine Learning

1802.01737

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)

Weakly-supervised Dictionary Learning

You, Zeyu, Raich, Raviv, Fern, Xiaoli Z., Kim, Jinsub

arXiv.org Machine Learning, Feb-5-2018

We present a probabilistic modeling and inference framework for discriminative analysis dictionary learning under a weak supervision setting. Dictionary learning approaches have been widely used for tasks such as low-level signal denoising and restoration as well as high-level classification tasks, which can be applied to audio and image analysis. Synthesis dictionary learning aims at jointly learning a dictionary and corresponding sparse coefficients to provide accurate data representation. This approach is useful for denoising and signal restoration, but may lead to sub-optimal classification performance. By contrast, analysis dictionary learning provides a transform that maps data to a sparse discriminative representation suitable for classification. We consider the problem of analysis dictionary learning for time-series data under a weak supervision setting in which signals are assigned with a global label instead of an instantaneous label signal. We propose a discriminative probabilistic model that incorporates both label information and sparsity constraints on the underlying latent instantaneous label signal using cardinality control. We present the expectation maximization (EM) procedure for maximum likelihood estimation (MLE) of the proposed model. To facilitate a computationally efficient E-step, we propose both a chain and a novel tree graph reformulation of the graphical model. The performance of the proposed model is demonstrated on both synthetic and real-world data.

artificial intelligence, data mining, machine learning, (20 more...)

arXiv.org Machine Learning

1802.01709

Country: North America > United States (1.00)

Genre: Research Report (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Fast and accurate approximation of the full conditional for gamma shape parameters

Miller, Jeffrey W.

arXiv.org Machine Learning, Feb-5-2018

The gamma distribution arises frequently in Bayesian models, but there is not an easy-to-use conjugate prior for the shape parameter of a gamma. This inconvenience is usually dealt with by using either Metropolis-Hastings moves, rejection sampling methods, or numerical integration. However, in models with a large number of shape parameters, these existing methods are slower or more complicated than one would like, making them burdensome in practice. It turns out that the full conditional distribution of the gamma shape parameter is well approximated by a gamma distribution, even for small sample sizes. This article introduces a quick and easy algorithm for finding a gamma distribution that approximates the full conditional distribution of the shape parameter. We empirically demonstrate the speed and accuracy of the approximation across a wide range of conditions. If exactness is required, the approximation can be used as a proposal distribution for Metropolis-Hastings.
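
To make the target of that approximation concrete, the sketch below writes out the unnormalised log full conditional of a gamma shape parameter. It does not reproduce the article's approximation algorithm. It assumes i.i.d. Gamma(shape a, rate b) observations with known rate and, purely for illustration, a Gamma(`prior_shape`, `prior_rate`) prior on the shape; the function name and both prior parameters are hypothetical.

```python
import math


def log_full_conditional_shape(a, xs, rate, prior_shape=1.0, prior_rate=1.0):
    """Unnormalised log p(a | xs, rate) for the shape a of a gamma likelihood.

    Terms that do not depend on a (such as -rate * sum(xs)) are dropped,
    so values are only meaningful up to an additive constant.
    """
    if a <= 0.0:
        return float("-inf")
    n = len(xs)
    sum_log_x = sum(math.log(x) for x in xs)
    # Gamma(a, rate) log-likelihood, keeping only the a-dependent terms.
    log_lik = n * a * math.log(rate) - n * math.lgamma(a) + (a - 1.0) * sum_log_x
    # Assumed Gamma(prior_shape, prior_rate) prior on the shape itself.
    log_prior = (prior_shape - 1.0) * math.log(a) - prior_rate * a
    return log_lik + log_prior
```

The `lgamma` term is what rules out a standard conjugate update here; a function like this is what a Metropolis-Hastings move or a numerical integrator would evaluate repeatedly.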

approximation, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

1802.0161

Genre: Research Report (0.90)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Counting and Uniform Sampling from Markov Equivalent DAGs

Ghassami, AmirEmad, Salehkaleybar, Saber, Kiyavash, Negar

arXiv.org Machine Learning, Feb-4-2018

Directed acyclic graphs (DAGs) are the most commonly used graphical model to represent causal relationships among a set of variables. In a DAG representation, a directed edge indicates a direct causal relationship between the corresponding variables. Under Markov property and faithfulness assumptions, conditional d-separation of variables in a DAG is in bijective correspondence with conditional independencies of the variables in the underlying joint probability distribution (Spirtes et al., 2000), and hence, a DAG representation demonstrates conditional independencies among its variables. The general approach for learning a causal structure is to use statistical data from the variables to find a DAG which is the most consistent with the conditional independencies in the given data. However, a DAG representation of a set of conditional independencies is not always unique. This restricts the learning of the causal structure to Markov equivalence classes (MECs), where elements of each class represent the same set of conditional independencies.
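
Membership in the same Markov equivalence class can be checked with the classical criterion that two DAGs are Markov equivalent exactly when they share the same skeleton and the same v-structures (colliders whose two parents are not adjacent). The sketch below illustrates only that test; it is not the counting or sampling method of the paper. The function names are hypothetical, each edge is assumed to be a `(parent, child)` tuple, and both graphs are assumed to be acyclic and defined over the same, sortable node labels.

```python
from itertools import combinations


def skeleton(edges):
    """Undirected version of the graph: a set of unordered node pairs."""
    return {frozenset(edge) for edge in edges}


def v_structures(edges):
    """All colliders a -> c <- b in which a and b are not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    found = set()
    for child, pars in parents.items():
        for a, b in combinations(sorted(pars), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found


def markov_equivalent(edges1, edges2):
    """True when both DAGs have the same skeleton and the same v-structures."""
    return (skeleton(edges1) == skeleton(edges2)
            and v_structures(edges1) == v_structures(edges2))
```

For example, the chain `[("a", "b"), ("b", "c")]` and the fork `[("b", "a"), ("b", "c")]` encode the same conditional independencies, whereas the collider `[("a", "b"), ("c", "b")]` does not.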

artificial intelligence, graph, machine learning, (17 more...)

arXiv.org Machine Learning

1802.01239

Country: North America > United States > Illinois (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.34)

On the Minimax Misclassification Ratio of Hypergraph Community Detection

Chien, I, Lin, Chung-Yi, Wang, I-Hsiang

arXiv.org Machine Learning, Feb-3-2018

Community detection in hypergraphs is explored. Under a generative hypergraph model called "d-wise hypergraph stochastic block model" (d-hSBM) which naturally extends the Stochastic Block Model from graphs to d-uniform hypergraphs, the asymptotic minimax mismatch ratio is characterized. For proving the achievability, we propose a two-step polynomial time algorithm that achieves the fundamental limit. The first step of the algorithm is a hypergraph spectral clustering method which achieves partial recovery to a certain precision level. The second step is a local refinement method which leverages the underlying probabilistic model along with parameter estimation from the outcome of the first step. To characterize the asymptotic performance of the proposed algorithm, we first derive a sufficient condition for attaining weak consistency in the hypergraph spectral clustering step. Then, under the guarantee of weak consistency in the first step, we upper bound the worst-case risk attained in the local refinement step by an exponentially decaying function of the size of the hypergraph and characterize the decaying rate. For proving the converse, the lower bound of the minimax mismatch ratio is set by finding a smaller parameter space which contains the most dominant error events, inspired by the analysis in the achievability part. It turns out that the minimax mismatch ratio decays exponentially fast to zero as the number of nodes tends to infinity, and the rate function is a weighted combination of several divergence terms, each of which is the Rényi divergence of order 1/2 between two Bernoulli distributions. The Bernoulli distributions involved in the characterization of the rate function are those governing the random instantiation of hyperedges in d-hSBM. Experimental results on synthetic data validate our theoretical finding that the refinement step is critical in achieving the optimal statistical limit.
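
The divergence terms in that rate function have a simple closed form. The sketch below evaluates the Rényi divergence of order 1/2 between Bernoulli(p) and Bernoulli(q) using the standard definition, which at order 1/2 reduces to -2 log(sqrt(pq) + sqrt((1-p)(1-q))). The function name is hypothetical, and the weights that combine these terms into the paper's rate function are not reproduced here.

```python
import math


def renyi_half_bernoulli(p, q):
    """Renyi divergence of order 1/2 between Bernoulli(p) and Bernoulli(q)."""
    if not (0.0 <= p <= 1.0 and 0.0 <= q <= 1.0):
        raise ValueError("p and q must be probabilities in [0, 1]")
    # Bhattacharyya coefficient of the two distributions.
    bc = math.sqrt(p * q) + math.sqrt((1.0 - p) * (1.0 - q))
    if bc == 0.0:
        # Disjoint supports (p = 0, q = 1 or the reverse): infinite divergence.
        return float("inf")
    # Clamp so that rounding above 1 cannot produce a tiny negative value.
    return -2.0 * math.log(min(bc, 1.0))
```

At order 1/2 the divergence is symmetric in p and q and equals zero only when the two hyperedge probabilities coincide, so the further apart they are, the larger the exponent.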

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Machine Learning

1802.00926

Country:

Europe (0.45)
North America > United States (0.28)

Genre: Workflow (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.66)

Understanding Objective Functions in Neural Networks

@machinelearnbot, Feb-2-2018, 23:56:14 GMT

The main inspiration for this blog post is the work I did on Bayesian Neural Networks with my friend Brian Trippe at the Computational and Biological Learning Lab at Cambridge University. I highly recommend Brian's thesis on variational inference in neural networks to anyone. Disclaimer: At the Computational and Biological Learning Lab, Bayesian machine learning techniques are unapologetically taught as the way forward. As such, be aware of potential bias in this blog post. For example, in image classification, x represents an image and y the corresponding image label.

artificial intelligence, bayesian inference, machine learning, (16 more...)

@machinelearnbot

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.25)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.33)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.30)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.30)

Bayesian Renewables Scenario Generation via Deep Generative Networks

Chen, Yize, Li, Pan, Zhang, Baosen

arXiv.org Machine Learning, Feb-2-2018

We present a method to generate renewable scenarios using Bayesian probabilities by implementing the Bayesian generative adversarial network (Bayesian GAN), which is a variant of generative adversarial networks based on two interconnected deep neural networks. By using a Bayesian formulation, generators can be constructed and trained to produce scenarios that capture different salient modes in the data, allowing for better diversity and more accurate representation of the underlying physical process. Compared to conventional statistical models that are often hard to scale or sample from, this method is model-free and can generate samples extremely efficiently. For validation, we use wind and solar time-series data from NREL integration data sets to train the Bayesian GAN. We demonstrate that the proposed method is able to generate clusters of wind scenarios with different variance and mean value, and is able to distinguish and generate wind and solar scenarios simultaneously even if the historical data are intentionally mixed.

artificial intelligence, bayesian inference, machine learning, (15 more...)

arXiv.org Machine Learning

1802.00868

Country: North America > United States (0.69)

Genre: Research Report (0.50)

Industry: Energy > Renewable > Wind (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)

VIBNN: Hardware Acceleration of Bayesian Neural Networks

Cai, Ruizhe, Ren, Ao, Liu, Ning, Ding, Caiwen, Wang, Luhao, Qian, Xuehai, Pedram, Massoud, Wang, Yanzhi

arXiv.org Machine Learning, Feb-2-2018

Bayesian Neural Networks (BNNs) have been proposed to address the problem of model uncertainty in training and inference. By introducing weights associated with conditioned probability distributions, BNNs are capable of resolving the overfitting issue commonly seen in conventional neural networks and allow for small-data training, through the variational inference process. Frequent usage of Gaussian random variables in this process requires a properly optimized Gaussian Random Number Generator (GRNG). The high hardware cost of conventional GRNG makes the hardware implementation of BNNs challenging. In this paper, we propose VIBNN, an FPGA-based hardware accelerator design for variational inference on BNNs. We explore the design space for massive amount of Gaussian variable sampling tasks in BNNs. Specifically, we introduce two high performance Gaussian (pseudo) random number generators: 1) the RAM-based Linear Feedback Gaussian Random Number Generator (RLF-GRNG), which is inspired by the properties of binomial distribution and linear feedback logics; and 2) the Bayesian Neural Network-oriented Wallace Gaussian Random Number Generator. To achieve high scalability and efficient memory access, we propose a deep pipelined accelerator architecture with fast execution and good hardware utilization. Experimental results demonstrate that the proposed VIBNN implementations on an FPGA can achieve throughput of 321,543.4

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Machine Learning

doi: 10.1145/3173162.3173212

1802.00822

Country: North America > United States > California > Los Angeles County > Los Angeles (0.28)

Genre: Research Report > New Finding (0.34)

Industry:

Information Technology (0.68)
Health & Medicine > Therapeutic Area > Oncology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
