AITopics | Uncertainty

Collaborating Authors

Uncertainty

"AI systems–like people–must often act despite partial and uncertain information. First, the information received may be unreliable (e.g., a patient may mis-remember when a disease started, or may not have noticed a symptom that is important to a diagnosis). In addition, rules connecting real-world events can never include all the factors that might determine whether their conclusions really apply (e.g., the correctness of basing a diagnosis on a lab test depends whether there were conditions that might have caused a false positive, on the test being done correctly, on the results being associated with the right patient, etc.) Thus in order to draw useful conclusions, AI systems must be able to reason about the probability of events, given their current knowledge."
– from David Leake, Reasoning Under Uncertainty

News Overviews Instructional Materials AI-Alerts Classics

Semi-Separable Hamiltonian Monte Carlo for Inference in Bayesian Hierarchical Models

Zhang, Yichuan, Sutton, Charles

Neural Information Processing SystemsDec-30-2014, 18:00:00 GMT

Sampling from hierarchical Bayesian models is often difficult for MCMC methods, because of the strong correlations between the model parameters and the hyperparameters. Recent Riemannian manifold Hamiltonian Monte Carlo (RMHMC) methods have significant potential advantages in this setting, but are computationally expensive. We introduce a new RMHMC method, which we call semi-separable Hamiltonian Monte Carlo, which uses a specially designed mass matrix that allows the joint Hamiltonian over model parameters and hyperparameters to decompose into two simpler Hamiltonians. This structure is exploited by a new integrator which we call the alternating blockwise leapfrog algorithm. The resulting method can mix faster than simpler Gibbs sampling while being simpler and more efficient than previous instances of RMHMC.

artificial intelligence, machine learning, monte carlo, (17 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Add feedback

A Bayesian encourages dropout

Maeda, Shin-ichi

arXiv.org Machine LearningDec-30-2014

Dropout is one of the key techniques to prevent the learning from overfitting. It is explained that dropout works as a kind of modified L2 regularization. Here, we shed light on the dropout from Bayesian standpoint. Bayesian interpretation enables us to optimize the dropout rate, which is beneficial for learning of weight parameters and prediction after learning. The experiment result also encourages the optimization of the dropout.

algorithm, dropout, dropout rate, (16 more...)

arXiv.org Machine Learning

1412.7003

Country:

Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.05)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.95)

Add feedback

Accurate and Conservative Estimates of MRF Log-likelihood using Reverse Annealing

Burda, Yuri, Grosse, Roger B., Salakhutdinov, Ruslan

arXiv.org Machine LearningDec-30-2014

Markov random fields (MRFs) are difficult to evaluate as generative models because computing the test log-probabilities requires the intractable partition function. Annealed importance sampling (AIS) is widely used to estimate MRF partition functions, and often yields quite accurate results. However, AIS is prone to overestimate the log-likelihood with little indication that anything is wrong. We present the Reverse AIS Estimator (RAISE), a stochastic lower bound on the log-likelihood of an approximation to the original MRF model. RAISE requires only the same MCMC transition operators as standard AIS. Experimental results indicate that RAISE agrees closely with AIS log-probability estimates for RBMs, DBMs, and DBNs, but typically errs on the side of underestimating, rather than overestimating, the log-likelihood.

artificial intelligence, machine learning, rbm, (17 more...)

arXiv.org Machine Learning

1412.8566

Country: North America > Canada > Ontario > Toronto (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)

Add feedback

Inference for Sparse Conditional Precision Matrices

Wang, Jialei, Kolar, Mladen

arXiv.org Machine LearningDec-24-2014

Given $n$ i.i.d. observations of a random vector $(X,Z)$, where $X$ is a high-dimensional vector and $Z$ is a low-dimensional index variable, we study the problem of estimating the conditional inverse covariance matrix $\Omega(z) = (E[(X-E[X \mid Z])(X-E[X \mid Z])^T \mid Z=z])^{-1}$ under the assumption that the set of non-zero elements is small and does not depend on the index variable. We develop a novel procedure that combines the ideas of the local constant smoothing and the group Lasso for estimating the conditional inverse covariance matrix. A proximal iterative smoothing algorithm is used to solve the corresponding convex optimization problems. We prove that our procedure recovers the conditional independence assumptions of the distribution $X \mid Z$ with high probability. This result is established by developing a uniform deviation bound for the high-dimensional conditional covariance matrix from its population counterpart, which may be of independent interest. Furthermore, we develop point-wise confidence intervals for individual elements of the conditional inverse covariance matrix. We perform extensive simulation studies, in which we demonstrate that our proposed procedure outperforms sensible competitors. We illustrate our proposal on a S&P 500 stock price data set.

estimation, health & medicine, optimization problem, (20 more...)

arXiv.org Machine Learning

1412.7638

Country:

Asia > Middle East > Israel (0.14)
North America > United States > Oklahoma (0.14)

Genre: Research Report (0.84)

Industry:

Energy > Oil & Gas (0.93)
Banking & Finance > Trading (0.86)
Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.66)

Add feedback

Tutorial on Structured Continuous-Time Markov Processes

Shelton, C. R., Ciardo, G.

Journal of Artificial Intelligence ResearchDec-23-2014

A continuous-time Markov process (CTMP) is a collection of variables indexed by a continuous quantity, time. It obeys the Markov property that the distribution over a future variable is independent of past variables given the state at the present time. We introduce continuous-time Markov process representations and algorithms for filtering, smoothing, expected sufficient statistics calculations, and model estimation, assuming no prior knowledge of continuous-time processes but some basic knowledge of probability and statistics. We begin by describing "flat" or unstructured Markov processes and then move to structured Markov processes (those arising from state spaces consisting of assignments to variables) including Kronecker, decision-diagram, and continuous-time Bayesian network representations. We provide the first connection between decision-diagrams and continuous-time Bayesian networks.

ev mdd, matrix, transition, (15 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.4415

AI Access Foundation

10921

Journal of Artificial Intelligence Research

Country:

Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
North America > United States > Iowa (0.04)
North America > United States > California > Riverside County > Riverside (0.04)
(5 more...)

Genre: Instructional Material > Course Syllabus & Notes (0.68)

Industry:

Information Technology (0.67)
Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Model Selection in High-Dimensional Misspecified Models

Basu, Pallavi, Feng, Yang, Lv, Jinchi

arXiv.org Machine LearningDec-23-2014

Model selection is indispensable to high-dimensional sparse modeling in selecting the best set of covariates among a sequence of candidate models. Most existing work assumes implicitly that the model is correctly specified or of fixed dimensions. Yet model misspecification and high dimensionality are common in real applications. In this paper, we investigate two classical Kullback-Leibler divergence and Bayesian principles of model selection in the setting of high-dimensional misspecified models. Asymptotic expansions of these principles reveal that the effect of model misspecification is crucial and should be taken into account, leading to the generalized AIC and generalized BIC in high dimensions. With a natural choice of prior probabilities, we suggest the generalized BIC with prior probability which involves a logarithmic factor of the dimensionality in penalizing model complexity. We further establish the consistency of the covariance contrast matrix estimator in a general setting. Our results and new method are supported by numerical studies.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

1412.7468

Country: North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)

Add feedback

Parameter estimation in spherical symmetry groups

Chen, Yu-Hui, Wei, Dennis, Newstadt, Gregory, DeGraef, Marc, Simmons, Jeffrey, Hero, Alfred

arXiv.org Machine LearningDec-21-2014

This paper considers statistical estimation problems where the probability distribution of the observed random variable is invariant with respect to actions of a finite topological group. It is shown that any such distribution must satisfy a restricted finite mixture representation. When specialized to the case of distributions over the sphere that are invariant to the actions of a finite spherical symmetry group $\mathcal G$, a group-invariant extension of the Von Mises Fisher (VMF) distribution is obtained. The $\mathcal G$-invariant VMF is parameterized by location and scale parameters that specify the distribution's mean orientation and its concentration about the mean, respectively. Using the restricted finite mixture representation these parameters can be estimated using an Expectation Maximization (EM) maximum likelihood (ML) estimation algorithm. This is illustrated for the problem of mean crystal orientation estimation under the spherically symmetric group associated with the crystal form, e.g., cubic or octahedral or hexahedral. Simulations and experiments establish the advantages of the extended VMF EM-ML estimator for data acquired by Electron Backscatter Diffraction (EBSD) microscopy of a polycrystalline Nickel alloy sample.

artificial intelligence, machine learning, orientation, (17 more...)

arXiv.org Machine Learning

doi: 10.1109/LSP.2014.2387206

1411.254

Country: North America > United States > Michigan (0.28)

Genre: Research Report (0.40)

Industry: Materials (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.51)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Add feedback

Cauchy Principal Component Analysis

Xie, Pengtao, Xing, Eric

arXiv.org Machine LearningDec-19-2014

Principal Component Analysis (PCA) has wide applications in machine learning, text mining and computer vision. Classical PCA based on a Gaussian noise model is fragile to noise of large magnitude. Laplace noise assumption based PCA methods cannot deal with dense noise effectively. In this paper, we propose Cauchy Principal Component Analysis (Cauchy PCA), a very simple yet effective PCA method which is robust to various types of noise. We utilize Cauchy distribution to model noise and derive Cauchy PCA under the maximum likelihood estimation (MLE) framework with low rank constraint. Our method can robustly estimate the low rank matrix regardless of whether noise is large or small, dense or sparse. We analyze the robustness of Cauchy PCA from a robust statistics view and present an efficient singular value projection optimization method. Experimental results on both simulated data and real applications demonstrate the robustness of Cauchy PCA to various noise patterns.

artificial intelligence, bayesian inference, machine learning, (19 more...)

arXiv.org Machine Learning

1412.6506

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.82)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.55)

Add feedback

A la Carte - Learning Fast Kernels

Yang, Zichao, Smola, Alexander J., Song, Le, Wilson, Andrew Gordon

arXiv.org Machine LearningDec-19-2014

The generalisation properties of a kernel method are entirely controlled by a kernel function, which represents an inner product of arbitrarily many basis functions. Kernel methods typically face a tradeoff between speed and flexibility. Methods which learn a kernel lead to slow and expensive to compute function classes, whereas many fast function classes are not adaptive. This problem is compounded by the fact that expressive kernel learning methods are most needed on large modern datasets, which provide unprecedented opportunities to automatically learn rich statistical representations. For example, the recent spectral kernels proposed by Wilson and Adams [2013] are flexible, but require an arbitrarily large number of basis functions, combined with many free hyperparameters, which can lead to major computational restrictions.

artificial intelligence, kernel, machine learning, (16 more...)

arXiv.org Machine Learning

1412.6493

Country: North America > United States (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Kernel Methods (0.75)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)

Add feedback

From dependency to causality: a machine learning approach

Bontempi, Gianluca, Flauder, Maxime

arXiv.org Machine LearningDec-19-2014

The relationship between statistical dependency and causality lies at the heart of all statistical approaches to causal inference and can be summarized by two famous statements: correlation (or more generally statistical association) does not imply causation and causation induces a statistical dependency between causes and effects (or more generally descendants) ([26]). In other terms it is well known that statistical dependency is a necessary yet not sufficient condition for causality. The unidirectional link between these 1 two notions has been used by many formal approaches to causality to justify the adoption of statistical methods for detecting or inferring causal links from observational data. The most influential one is the Causal Bayesian Network approach, detailed in ([17]) which relies on notions of independence and conditional independence to detect causal patterns in the data. Well known examples of related inference algorithms are the constraint-based methods like the PC algorithms ([30]) and IC ([23]). These approaches are founded on probability theory and have been shown to be accurate in reconstructing causal patterns in many applications.

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

1412.6285

Country: Europe (0.28)

Genre: Research Report > Promising Solution (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Add feedback