AITopics | Uncertainty

Uncertainty

"AI systems–like people–must often act despite partial and uncertain information. First, the information received may be unreliable (e.g., a patient may mis-remember when a disease started, or may not have noticed a symptom that is important to a diagnosis). In addition, rules connecting real-world events can never include all the factors that might determine whether their conclusions really apply (e.g., the correctness of basing a diagnosis on a lab test depends whether there were conditions that might have caused a false positive, on the test being done correctly, on the results being associated with the right patient, etc.) Thus in order to draw useful conclusions, AI systems must be able to reason about the probability of events, given their current knowledge."
– from David Leake, Reasoning Under Uncertainty
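
The lab-test example in the quotation can be made concrete with Bayes' rule. The sketch below is only an illustration: the prevalence, sensitivity, and false-positive rate are hypothetical numbers chosen for the example, not values from the quoted text.

```python
def posterior_given_positive(prior, sensitivity, false_positive_rate):
    """P(disease | positive test) via Bayes' rule."""
    true_positive_mass = sensitivity * prior
    false_positive_mass = false_positive_rate * (1.0 - prior)
    return true_positive_mass / (true_positive_mass + false_positive_mass)


# Hypothetical inputs: 1% prevalence, 95% sensitivity, 5% false-positive rate.
posterior = posterior_given_positive(
    prior=0.01, sensitivity=0.95, false_positive_rate=0.05
)
print(f"P(disease | positive test) = {posterior:.3f}")
```

Under these assumed numbers, a positive result leaves the probability of disease at roughly 16%, because false positives from the large healthy population outnumber true positives.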

Bayesian Nonparametric Multilevel Clustering with Group-Level Contexts

Nguyen, Vu, Phung, Dinh, Nguyen, XuanLong, Venkatesh, Svetha, Bui, Hung Hai

arXiv.org Machine Learning, Jan-28-2014

We present a Bayesian nonparametric framework for multilevel clustering which utilizes group-level context information to simultaneously discover low-dimensional structures of the group contents and partitions groups into clusters. Using the Dirichlet process as the building block, our model constructs a product base-measure with a nested structure to accommodate content and context observations at multiple levels. The proposed model possesses properties that link the nested Dirichlet processes (nDP) and the Dirichlet process mixture models (DPM) in an interesting way: integrating out all contents results in the DPM over contexts, whereas integrating out group-specific contexts results in the nDP mixture over content variables. We provide a Polya-urn view of the model and an efficient collapsed Gibbs inference procedure. Extensive experiments on real-world datasets demonstrate the advantage of utilizing context information via our model in both text and image domains.

artificial intelligence, machine learning, natural language, (16 more...)

1401.1974

Country: North America > United States (1.00)

Genre: Research Report (0.64)

Industry: Consumer Products & Services > Restaurants (0.68)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.93)
(2 more...)

Community Detection in Networks using Graph Distance

Bhattacharyya, Sharmodeep, Bickel, Peter J.

arXiv.org Machine Learning, Jan-24-2014

The study of networks has received increased attention recently, not only from the social sciences and statistics but also from physicists, computer scientists and mathematicians. One of the principal problems in networks is community detection. Many algorithms have been proposed for community finding, but most of them do not have theoretical guarantees for sparse networks and networks close to the phase transition boundary proposed by physicists. There are some exceptions, but all have a somewhat incomplete theoretical basis. Here we propose an algorithm based on the graph distance of vertices in the network. We give theoretical guarantees that our method works in identifying communities for block models and can be extended to degree-corrected block models and block models with the number of communities growing with the number of vertices. Despite favorable simulation results, we are not yet able to conclude that our method is satisfactory for the worst possible case. We illustrate the method on a network of political blogs, Facebook networks and some other networks.
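
As a rough illustration of the quantity the abstract builds on, the sketch below computes graph distance (the number of edges on a shortest path) between all pairs of vertices using breadth-first search. It covers only the distance computation, not the authors' community detection procedure. The function name `graph_distances` and the toy graph are hypothetical.

```python
from collections import deque
import math


def graph_distances(adjacency):
    """Return dist[u][v] = number of edges on a shortest path from u to v.

    `adjacency` maps each vertex to a list of its neighbors; every neighbor
    is assumed to appear as a key. Unreachable pairs get math.inf.
    """
    dist = {}
    for source in adjacency:
        d = {v: math.inf for v in adjacency}
        d[source] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if d[v] == math.inf:  # first visit gives the shortest hop count
                    d[v] = d[u] + 1
                    queue.append(v)
        dist[source] = d
    return dist


# Hypothetical toy graph: two triangles joined by the single edge c-d.
toy_graph = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["c", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
distances = graph_distances(toy_graph)
```

In this toy graph, vertices inside the same triangle are one edge apart, while vertices in different triangles are two or three edges apart, which is the kind of separation a distance-based method could exploit.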

data mining, machine learning, vertex, (18 more...)

1401.3915

Country: North America > United States (0.92)

Genre: Research Report (0.40)

Industry: Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
(2 more...)

Multimodal Transitions for Generative Stochastic Networks

Ozair, Sherjil, Yao, Li, Bengio, Yoshua

arXiv.org Machine Learning, Jan-24-2014

Generative Stochastic Networks (GSNs) have been recently introduced as an alternative to traditional probabilistic modeling: instead of parametrizing the data distribution directly, one parametrizes a transition operator for a Markov chain whose stationary distribution is an estimator of the data generating distribution. The result of training is therefore a machine that generates samples through this Markov chain. However, the previously introduced GSN consistency theorems suggest that in order to capture a wide class of distributions, the transition operator in general should be multimodal, something that has not been done before this paper. We introduce for the first time multimodal transition distributions for GSNs, in particular using models in the NADE family (Neural Autoregressive Density Estimator) as output distributions of the transition operator. A NADE model is related to an RBM (and can thus model multimodal distributions) but its likelihood (and likelihood gradient) can be computed easily. The parameters of the NADE are obtained as a learned function of the previous state of the learned Markov chain. Experiments clearly illustrate the advantage of such multimodal transition distributions over unimodal GSNs.

artificial intelligence, bengio, machine learning, (16 more...)

1312.5578

Country: North America > Canada > Quebec (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.78)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)

Asymptotic Accuracy of Bayes Estimation for Latent Variables with Redundancy

Yamazaki, Keisuke

arXiv.org Machine Learning, Jan-23-2014

Hierarchical parametric models consisting of observable and latent variables are widely used for unsupervised learning tasks. For example, a mixture model is a representative hierarchical model for clustering. From the statistical point of view, the models can be regular or singular depending on the distribution of the data. In the regular case, the models are identifiable: there is a one-to-one relation between the probability density function expressed by the model and the parameter. The Fisher information matrix is positive definite, and the estimation accuracy of both observable and latent variables has been studied. In the singular case, on the other hand, the models are not identifiable and the Fisher matrix is not positive definite. Conventional statistical analysis based on the inverse Fisher matrix is not applicable. Recently, an algebraic geometrical analysis has been developed and is used to elucidate the Bayes estimation of observable variables. The present paper applies this analysis to latent-variable estimation and determines its theoretical performance. Our results clarify the convergence behavior of the posterior distribution. It is found that the posterior of the observable-variable estimation can be different from the one in the latent-variable estimation. Because of this difference, the Markov chain Monte Carlo method based on the parameter and the latent variable cannot construct the desired posterior distribution.

artificial intelligence, bayesian inference, machine learning, (20 more...)

1205.3234

Country:

North America > United States (0.67)
Asia (0.46)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.88)

Gaussian-binary Restricted Boltzmann Machines on Modeling Natural Image Statistics

Wang, Nan, Melchior, Jan, Wiskott, Laurenz

arXiv.org Machine Learning, Jan-23-2014

We present a theoretical analysis of Gaussian-binary restricted Boltzmann machines (GRBMs) from the perspective of density models. The key aspect of this analysis is to show that GRBMs can be formulated as a constrained mixture of Gaussians, which gives a much better insight into the model's capabilities and limitations. We show that GRBMs are capable of learning meaningful features both in a two-dimensional blind source separation task and in modeling natural images. Further, we show that reported difficulties in training GRBMs are due to the failure of the training algorithm rather than the model itself. Based on our analysis we are able to propose several training recipes, which allowed successful and fast training in our experiments. Finally, we discuss the relationship of GRBMs to several modifications that have been proposed to improve the model.

artificial intelligence, grbm, machine learning, (19 more...)

doi: 10.1371/journal.pone.0171015

1401.59

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.61)

Causal Discovery in a Binary Exclusive-or Skew Acyclic Model: BExSAM

Inazumi, Takanori, Washio, Takashi, Shimizu, Shohei, Suzuki, Joe, Yamamoto, Akihiro, Kawahara, Yoshinobu

arXiv.org Machine Learning, Jan-22-2014

Discovering causal relations among observed variables in a given data set is a major objective in studies of statistics and artificial intelligence. Recently, some techniques to discover a unique causal model have been explored based on non-Gaussianity of the observed data distribution. However, most of these are limited to continuous data. In this paper, we present a novel causal model for binary data and propose an efficient new approach to deriving the unique causal model governing a given binary data set under skew distributions of external binary noises. Experimental evaluation shows excellent performance for both artificial and real-world data sets.

algorithm, artificial intelligence, machine learning, (17 more...)

1401.5636

Country:

North America > United States (1.00)
Asia > Japan > Honshū > Kansai (0.14)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Hilbert Space Methods for Reduced-Rank Gaussian Process Regression

Solin, Arno, Särkkä, Simo

arXiv.org Machine Learning, Jan-21-2014

This paper proposes a novel scheme for reduced-rank Gaussian process regression. The method is based on an approximate series expansion of the covariance function in terms of an eigenfunction expansion of the Laplace operator in a compact subset of $\mathbb{R}^d$. On this approximate eigenbasis the eigenvalues of the covariance function can be expressed as simple functions of the spectral density of the Gaussian process, which allows the GP inference to be solved under a computational cost scaling as $\mathcal{O}(nm^2)$ (initial) and $\mathcal{O}(m^3)$ (hyperparameter learning) with $m$ basis functions and $n$ data points. The approach also allows for rigorous error analysis with Hilbert space theory, and we show that the approximation becomes exact when the size of the compact subset and the number of eigenfunctions go to infinity. The expansion generalizes to Hilbert spaces with an inner product which is defined as an integral over a specified input density. The method is compared to previously proposed methods theoretically and through empirical tests with simulated and real data.
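
The series expansion described in the abstract can be sketched numerically. The example below is a minimal illustration under stated assumptions, not the authors' implementation: it works in one dimension on the interval $[-L, L]$ with Dirichlet boundary conditions, uses a squared-exponential covariance function, and approximates $k(x, x') \approx \sum_{j=1}^{m} S(\sqrt{\lambda_j})\,\phi_j(x)\,\phi_j(x')$, where $S$ is the spectral density and $\phi_j$, $\lambda_j$ are the Laplace eigenfunctions and eigenvalues on that interval. All function names, the choice of kernel, and the parameter values are assumptions made for the example.

```python
import math


def se_kernel(x1, x2, sigma=1.0, ell=1.0):
    """Exact squared-exponential covariance between two scalar inputs."""
    return sigma**2 * math.exp(-0.5 * ((x1 - x2) / ell) ** 2)


def se_spectral_density(omega, sigma=1.0, ell=1.0):
    """Spectral density of the 1-D squared-exponential covariance."""
    return sigma**2 * math.sqrt(2.0 * math.pi) * ell * math.exp(-0.5 * (ell * omega) ** 2)


def laplace_eigenfunction(j, x, L):
    """j-th Dirichlet eigenfunction of the Laplace operator on [-L, L]."""
    return math.sin(math.pi * j * (x + L) / (2.0 * L)) / math.sqrt(L)


def reduced_rank_kernel(x1, x2, m=64, L=5.0, sigma=1.0, ell=1.0):
    """Approximate k(x1, x2) with m basis functions on [-L, L]."""
    total = 0.0
    for j in range(1, m + 1):
        sqrt_lambda = math.pi * j / (2.0 * L)  # square root of the j-th eigenvalue
        weight = se_spectral_density(sqrt_lambda, sigma, ell)
        total += weight * laplace_eigenfunction(j, x1, L) * laplace_eigenfunction(j, x2, L)
    return total


# Compare the approximation with the exact kernel well inside the domain.
points = [-1.0, -0.5, 0.0, 0.5, 1.0]
max_error = max(
    abs(reduced_rank_kernel(a, b) - se_kernel(a, b)) for a in points for b in points
)
print(f"max absolute error: {max_error:.2e}")
```

The inputs are kept far from the boundary at $\pm L$, where the approximation is expected to degrade. This sketch only approximates the covariance function and does not carry out the regression itself.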

artificial intelligence, covariance function, machine learning, (17 more...)

1401.5508

Country: Europe > United Kingdom (0.46)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Modeling & Simulation (0.93)

On change point detection using the fused lasso method

Rojas, Cristian R., Wahlberg, Bo

arXiv.org Machine Learning, Jan-21-2014

In this paper we analyze the asymptotic properties of $\ell_1$-penalized maximum likelihood estimation of signals with piecewise constant mean values and/or variances. The focus is on segmentation of a non-stationary time series with respect to changes in these model parameters. This change point detection and estimation problem is also referred to as total variation denoising or $\ell_1$ mean filtering and has many important applications in most fields of science and engineering. We establish the (approximate) sparse consistency properties, including rate of convergence, of the so-called fused lasso signal approximator (FLSA). We show that this only holds if the signs of the corresponding consecutive changes are all different, and that this estimator is otherwise incapable of correctly detecting the underlying sparsity pattern. The key idea is to notice that the optimality conditions for this problem can be analyzed using techniques related to Brownian bridge theory.
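
For orientation, total variation denoising of a signal with piecewise constant mean is commonly written as the following optimization problem. The notation ($y_t$ for the observations, $m_t$ for the estimated means, $\lambda$ for the regularization weight, $N$ for the number of samples) is an assumption made here and may differ from the paper's.

$$
\hat{m} = \arg\min_{m_1, \dots, m_N} \; \frac{1}{2} \sum_{t=1}^{N} (y_t - m_t)^2 + \lambda \sum_{t=2}^{N} \lvert m_t - m_{t-1} \rvert
$$

The penalty term pushes most consecutive differences $m_t - m_{t-1}$ to exactly zero, so the indices where the difference is nonzero serve as the estimated change points.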

artificial intelligence, change point, machine learning, (18 more...)

1401.5408

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)

Guaranteed Model Order Estimation and Sample Complexity Bounds for LDA

Gutiérrez, E. D.

arXiv.org Machine Learning, Jan-21-2014

The question of how to determine the number of independent latent factors, or topics, in Latent Dirichlet Allocation (LDA) is of great practical importance. In most applications, the exact number of topics is unknown, and depends on the application and the size of the data set. We introduce a spectral model selection procedure for topic number estimation that does not require learning the model's latent parameters beforehand and comes with probabilistic guarantees. The procedure is motivated by the spectral learning approach and relies on adaptations of results from random matrix theory. In a simulation experiment taken from the nonparametric Bayesian literature, this procedure outperforms the nonparametric Bayesian approach in both accuracy and speed. We also discuss some implications of our results for the sample complexity and accuracy of popular spectral learning algorithms for LDA. The principles underlying the procedure can be extended to spectral learning algorithms for other exchangeable mixture models with similar conditional independence properties.

artificial intelligence, machine learning, natural language, (19 more...)

1312.2646

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Properties of Bethe Free Energies and Message Passing in Gaussian Models

Cseke, Botond, Heskes, Tom

arXiv.org Machine Learning, Jan-16-2014

We address the problem of computing approximate marginals in Gaussian probabilistic models by using mean field and fractional Bethe approximations. We define the Gaussian fractional Bethe free energy in terms of the moment parameters of the approximate marginals, derive a lower and an upper bound on the fractional Bethe free energy and establish a necessary condition for the lower bound to be bounded from below. It turns out that the condition is identical to the pairwise normalizability condition, which is known to be a sufficient condition for the convergence of the message passing algorithm. We show that stable fixed points of the Gaussian message passing algorithm are local minima of the Gaussian Bethe free energy. By a counterexample, we disprove the conjecture stating that the unboundedness of the free energy implies the divergence of the message passing algorithm.

artificial intelligence, bethe free energy, free energy, (14 more...)

doi: 10.1613/jair.3195

1401.3877

Genre: Research Report (1.00)

Technology:

Information Technology > Architecture > Distributed Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.49)
