AITopics | Uncertainty

Uncertainty

"AI systems–like people–must often act despite partial and uncertain information. First, the information received may be unreliable (e.g., a patient may mis-remember when a disease started, or may not have noticed a symptom that is important to a diagnosis). In addition, rules connecting real-world events can never include all the factors that might determine whether their conclusions really apply (e.g., the correctness of basing a diagnosis on a lab test depends whether there were conditions that might have caused a false positive, on the test being done correctly, on the results being associated with the right patient, etc.) Thus in order to draw useful conclusions, AI systems must be able to reason about the probability of events, given their current knowledge."
– from David Leake, Reasoning Under Uncertainty

Uncertainty Estimation via Stochastic Batch Normalization

Atanov, Andrei, Ashukha, Arsenii, Molchanov, Dmitry, Neklyudov, Kirill, Vetrov, Dmitry

arXiv.org Machine Learning, Mar-20-2018

In this work, we investigate the Batch Normalization technique and propose its probabilistic interpretation. We propose a probabilistic model and show that Batch Normalization maximizes the lower bound of its marginalized log-likelihood. Then, according to the new probabilistic model, we design an algorithm which acts consistently during train and test. However, inference becomes computationally inefficient. To reduce memory and computational cost, we propose Stochastic Batch Normalization -- an efficient approximation of the proper inference procedure. This method provides us with a scalable uncertainty estimation technique. We demonstrate the performance of Stochastic Batch Normalization on popular architectures (including deep convolutional architectures: VGG-like and ResNets) for the MNIST and CIFAR-10 datasets.
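The train/test inconsistency mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it only shows that, in training mode, the normalized value of a fixed input depends on which mini-batch it is grouped with, whereas test mode uses fixed statistics. The function names, the data, the batch size, and the use of full-data statistics as a stand-in for running averages are all assumptions made for illustration.

```python
import random
import statistics

def batch_norm_train(batch, eps=1e-5):
    """Normalize a mini-batch of scalars by its own mean and variance."""
    mu = statistics.fmean(batch)
    var = statistics.pvariance(batch, mu)
    return [(x - mu) / (var + eps) ** 0.5 for x in batch], mu, var

def batch_norm_test(x, fixed_mu, fixed_var, eps=1e-5):
    """Normalize one scalar by fixed (population-level) statistics."""
    return (x - fixed_mu) / (fixed_var + eps) ** 0.5

random.seed(0)
# Hypothetical activations; the distribution is an arbitrary choice.
data = [random.gauss(2.0, 3.0) for _ in range(1000)]
x = 5.0  # the fixed input whose normalized value we track

# Training mode: the output for x changes with the sampled mini-batch.
outputs = []
for _ in range(200):
    batch = random.sample(data, 31) + [x]
    normed, mu, var = batch_norm_train(batch)
    outputs.append(normed[-1])

# Spread of the training-mode outputs: a crude indicator of the
# stochasticity induced by mini-batch statistics.
spread = statistics.pstdev(outputs)

# Test mode: a single deterministic output (full-data statistics are used
# here as a stand-in for the running averages kept in practice).
test_out = batch_norm_test(x, statistics.fmean(data), statistics.pvariance(data))

print(f"train-mode spread: {spread:.3f}, test-mode output: {test_out:.3f}")
```

The spread across mini-batches is the kind of randomness that a probabilistic reading of Batch Normalization has to account for; how the paper models and approximates it is described in the paper itself, not in this sketch.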

artificial intelligence, batch normalization, machine learning, (18 more...)

arXiv.org Machine Learning

1802.04893

Country: Oceania > Australia (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)

Predictor Variable Prioritization in Nonlinear Models: A Genetic Association Case Study

Crawford, Lorin, Flaxman, Seth R., Runcie, Daniel E., West, Mike

arXiv.org Machine Learning, Mar-20-2018

The central aim in this paper is to address variable selection questions in nonlinear and nonparametric regression. Motivated by statistical genetics, where nonlinear interactions are of particular interest, we introduce a novel, interpretable, and computationally efficient way to summarize the relative importance of predictor variables. Methodologically, we develop the "RelATive cEntrality" (RATE) measure to prioritize candidate genetic variants that are not just marginally important, but whose associations also stem from significant covarying relationships with other variants in the data. We illustrate RATE through Bayesian Gaussian process regression, but the methodological innovations apply to other nonlinear methods. It is known that nonlinear models often exhibit greater predictive accuracy than linear models, particularly for phenotypes generated by complex genetic architectures. With detailed simulations and an Arabidopsis thaliana QTL mapping study, we show that applying RATE enables an explanation for this improved performance.

data mining, machine learning, variant, (19 more...)

arXiv.org Machine Learning

1801.07318

Country:

Europe > United Kingdom > England (0.28)
North America > United States > California (0.28)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.48)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Copula Index for Detecting Dependence and Monotonicity between Stochastic Signals

Karra, Kiran, Mili, Lamine

arXiv.org Machine Learning, Mar-19-2018, 19:00:00 GMT

This paper introduces a nonparametric copula-based index for detecting the strength and monotonicity structure of linear and nonlinear statistical dependence between pairs of random variables or stochastic signals. Our index, termed Copula Index for Detecting Dependence and Monotonicity (CIM), satisfies several desirable properties of measures of association, including Rényi's properties, the data processing inequality (DPI), and consequently self-equitability. Synthetic data simulations reveal that the statistical power of CIM compares favorably to other state-of-the-art measures of association that are proven to satisfy the DPI. Simulation results with real-world data reveal the CIM's unique ability to detect the monotonicity structure among stochastic signals to find interesting dependencies in large datasets. Additionally, simulations show that the CIM shows favorable performance to estimators of mutual information when discovering Markov network structure.

artificial intelligence, dependency, machine learning, (19 more...)

arXiv.org Machine Learning

1703.06686

Country: North America > United States (1.00)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.46)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.49)

Momentum-Space Renormalization Group Transformation in Bayesian Image Modeling by Gaussian Graphical Model

Tanaka, Kazuyuki, Nakamura, Masamichi, Kataoka, Shun, Ohzeki, Masayuki, Yasuda, Muneki

arXiv.org Machine Learning, Mar-19-2018

A new Bayesian modeling method is proposed by combining the maximization of the marginal likelihood with a momentum-space renormalization group transformation for Gaussian graphical models. Moreover, we present a scheme for computing the statistical averages of hyperparameters and mean square errors in our proposed method based on a momentum-space renormalization transformation.

artificial intelligence, bayesian inference, machine learning, (8 more...)

arXiv.org Machine Learning

1804.00727

Country:

Asia > Japan > Honshū > Tōhoku (0.15)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.15)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.73)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.52)

Learning non-Gaussian Time Series using the Box-Cox Gaussian Process

Rios, Gonzalo, Tobar, Felipe

arXiv.org Machine Learning, Mar-19-2018

A Gaussian process (GP) [1] is a prior distribution over functions with a support that includes a wide class of phenomena via the design of its mean and covariance functions, the parameters of which provide meaningful interpretation of the process at hand. Beyond regression [2], GPs have been extensively used in the last two decades for classification [3], density estimation [4], filter design [5], model identification [6] and optimisation [7]. In general terms, all these generative models have two stages: The latent process is modelled as a GP and the observation is modelled (conditional to the latent process) as a non-Gaussian variable. This class of models is referred to as GP with non-Gaussian likelihood, or as Generalised GPs. These usually consider likelihood functions from the exponential family such as the Laplace, Poisson, beta and gamma distributions [8]. A well-known example is the GP classification model, where the classes are represented by the output of an activation neuron into which a latent GP is fed. A slightly different approach to non-Gaussian models, which is not constrained to the exponential family, is the warped GP (WGP, [9]). The WGP models non-Gaussian data by assuming that there is a transformation φ such that the observations can be passed through φ to yield a GP, therefore, the likelihood function of this model is not designed directly but, rather, induced by the transformation (a.k.a.

artificial intelligence, machine learning, modeling & simulation, (18 more...)

arXiv.org Machine Learning

1803.07102

Country: North America > United States (0.46)

Genre: Research Report (0.54)

Industry:

Government > Regional Government > North America Government > United States Government (0.46)
Banking & Finance > Economy (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.50)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.48)

Basics of Bayesian Decision Theory

@machinelearnbot, Mar-17-2018, 00:51:05 GMT

The use of formal statistical methods to analyse quantitative data in data science has increased considerably over the last few years. One such approach, Bayesian Decision Theory (BDT), also known as Bayesian Hypothesis Testing and Bayesian inference, is a fundamental statistical approach that quantifies the tradeoffs between various decisions using distributions and costs that accompany such decisions. In pattern recognition it is used for designing classifiers making the assumption that the problem is posed in probabilistic terms, and that all of the relevant probability values are known. Generally, we don't have such perfect information but it is a good place to start when studying machine learning, statistical inference, and detection theory in signal processing. BDT also has many applications in science, engineering, and medicine.
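As a minimal sketch of the idea described above, the snippet below picks the action with the lowest expected cost given class posteriors. The two-class setup, the priors, the likelihoods, and the loss table are all hypothetical numbers chosen for illustration, and the function names are not taken from the article.

```python
def posteriors(priors, likelihoods):
    """Bayes' rule: P(class | x) is proportional to P(x | class) * P(class)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(joint)
    return [j / evidence for j in joint]

def bayes_decision(post, loss):
    """Return the action with minimum expected loss, plus all expected losses.

    loss[a][c] is the cost of taking action a when the true class is c.
    """
    risks = [sum(l_ac * p_c for l_ac, p_c in zip(row, post)) for row in loss]
    best = min(range(len(risks)), key=risks.__getitem__)
    return best, risks

# Hypothetical numbers (assumptions for illustration only).
priors = [0.9, 0.1]        # P(healthy), P(disease)
likelihoods = [0.2, 0.9]   # P(observation | healthy), P(observation | disease)
loss = [[0.0, 10.0],       # action 0: do not treat
        [1.0, 0.0]]        # action 1: treat

post = posteriors(priors, likelihoods)
action, risks = bayes_decision(post, loss)
print(post, risks, action)
```

With these numbers the posterior still favors "healthy" (about 0.67 against 0.33), yet the chosen action is to treat, because a missed disease is assumed to be ten times as costly as an unnecessary treatment. This is the tradeoff between probabilities and costs that the decision rule quantifies.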

bayesian decision theory, bayesian inference, machine learning, (2 more...)

@machinelearnbot

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.99)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.65)

Topology Estimation using Graphical Models in Multi-Phase Power Distribution Grids

Deka, Deepjyoti, Chertkov, Michael, Backhaus, Scott

arXiv.org Machine Learning, Mar-17-2018

The distribution grid is the medium- and low-voltage part of a large power system. Structurally, the majority of distribution networks operate radially, such that energized lines form a collection of trees, i.e., a forest, with a substation at the root of each tree. The operational topology/forest may change from time to time; however, tracking these changes, even though important for distribution grid operation and control, is hindered by limited real-time monitoring. This paper develops a learning framework to reconstruct the radial operational structure of the distribution grid from synchronized voltage measurements in the grid subject to exogenous fluctuations in nodal power consumption. To detect operational lines, our learning algorithm uses conditional independence tests for continuous random variables that are applicable to a wide class of probability distributions of the nodal consumption, and to Gaussian injections in particular. Moreover, our algorithm applies to the practical case of unbalanced three-phase power flow. Algorithm performance is validated on AC power flow simulations over IEEE distribution grid test cases.

artificial intelligence, upstream oil & gas, voltage, (18 more...)

arXiv.org Machine Learning

1803.06531

Country:

North America > United States > New Mexico (0.14)
Asia (0.14)
Europe > Russia (0.14)

Genre: Research Report (0.40)

Industry:

Energy > Power Industry (1.00)
Energy > Oil & Gas > Upstream (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Convergence Rates of Latent Topic Models Under Relaxed Identifiability Conditions

Wang, Yining

arXiv.org Machine Learning, Mar-17-2018

In this paper we study the frequentist convergence rate for the Latent Dirichlet Allocation (Blei et al., 2003) topic models. We show that the maximum likelihood estimator converges to one of the finitely many equivalent parameters in Wasserstein's distance metric at a rate of $n^{-1/4}$ without assuming separability or non-degeneracy of the underlying topics and/or the existence of more than three words per document, thus generalizing the previous works of Anandkumar et al. (2012, 2014) from an information-theoretical perspective. We also show that the $n^{-1/4}$ convergence rate is optimal in the worst case.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Machine Learning

1710.1107

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)

Generative Bridging Network in Neural Sequence Prediction

Chen, Wenhu, Li, Guanlin, Ren, Shuo, Liu, Shujie, Zhang, Zhirui, Li, Mu, Zhou, Ming

arXiv.org Machine Learning, Mar-16-2018, 19:00:00 GMT

In order to alleviate data sparsity and over-fitting problems in maximum likelihood estimation (MLE) for sequence prediction tasks, we propose the Generative Bridging Network (GBN), in which a novel bridge module is introduced to assist the training of the sequence prediction model (the generator network). Unlike MLE directly maximizing the conditional likelihood, the bridge extends the point-wise ground truth to a bridge distribution conditioned on it, and the generator is optimized to minimize their KL-divergence. Three different GBNs, namely uniform GBN, language-model GBN and coaching GBN, are proposed to penalize confidence, enhance language smoothness and relieve learning burden. Experiments conducted on two recognized sequence prediction tasks (machine translation and abstractive text summarization) show that our proposed GBNs can yield significant improvements over strong baselines. Furthermore, by analyzing samples drawn from different bridges, expected influences on the generator are verified.

generator, machine learning, natural language, (19 more...)

arXiv.org Machine Learning

1706.09152

Country: Asia > China (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Impacts of Dirty Data: and Experimental Evaluation

Qi, Zhixin, Wang, Hongzhi, Li, Jianzhong, Gao, Hong

arXiv.org Machine Learning, Mar-16-2018

Data quality issues have attracted widespread attention due to the negative impacts of dirty data on data mining and machine learning results. The relationship between data quality and the accuracy of results could be applied to selecting an appropriate algorithm in light of data quality and to determining the share of data to clean. However, little research has focused on exploring this relationship. Motivated by this, this paper conducts an experimental comparison of the effects of missing, inconsistent and conflicting data on classification, clustering, and regression algorithms. Based on the experimental findings, we provide guidelines for algorithm selection and data cleaning.

algorithm, regression, sensitive algorithm, (16 more...)

arXiv.org Machine Learning

1803.06071

Country:

North America > United States > Massachusetts > Norfolk County > Canton (0.04)
Asia > China > Heilongjiang Province > Harbin (0.04)

Genre: Research Report > New Finding (0.69)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
