AITopics | Uncertainty

Collaborating Authors

Uncertainty

"AI systems–like people–must often act despite partial and uncertain information. First, the information received may be unreliable (e.g., a patient may mis-remember when a disease started, or may not have noticed a symptom that is important to a diagnosis). In addition, rules connecting real-world events can never include all the factors that might determine whether their conclusions really apply (e.g., the correctness of basing a diagnosis on a lab test depends whether there were conditions that might have caused a false positive, on the test being done correctly, on the results being associated with the right patient, etc.) Thus in order to draw useful conclusions, AI systems must be able to reason about the probability of events, given their current knowledge."
– from David Leake, Reasoning Under Uncertainty

News Overviews Instructional Materials AI-Alerts Classics

Parsimonious Topic Models with Salient Word Discovery

Soleimani, Hossein, Miller, David J.

arXiv.org Machine LearningSep-11-2014

We propose a parsimonious topic model for text corpora. In related models such as Latent Dirichlet Allocation (LDA), all words are modeled topic-specifically, even though many words occur with similar frequencies across different topics. Our modeling determines salient words for each topic, which have topic-specific probabilities, with the rest explained by a universal shared model. Further, in LDA all topics are in principle present in every document. By contrast our model gives sparse topic representation, determining the (small) subset of relevant topics for each document. We derive a Bayesian Information Criterion (BIC), balancing model complexity and goodness of fit. Here, interestingly, we identify an effective sample size and corresponding penalty specific to each parameter type in our model. We minimize BIC to jointly determine our entire model -- the topic-specific words, document-specific topics, all model parameter values, {\it and} the total number of topics -- in a wholly unsupervised fashion. Results on three text corpora and an image dataset show that our model achieves higher test set likelihood and better agreement with ground-truth class labels, compared to LDA and to a model designed to incorporate sparsity.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Machine Learning

doi: 10.1109/TKDE.2014.2345378

1401.6169

Country: North America > United States > Pennsylvania (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.91)

Add feedback

Scalable Bayesian Modelling of Paired Symbols

Paquet, Ulrich, Koenigstein, Noam, Winther, Ole

arXiv.org Machine LearningSep-10-2014

We present a novel, scalable and Bayesian approach to modelling the occurrence of pairs of symbols (i, j) drawn from a large vocabulary. Observed pairs are assumed to be generated by a simple popularity based selection process followed by censoring using a preference function. By basing inference on the well-founded principle of variational bounding, and using new site-independent bounds, we show how a scalable inference procedure can be obtained for large data sets. State of the art results are presented on real-world movie viewing data.

artificial intelligence, likelihood, machine learning, (17 more...)

arXiv.org Machine Learning

1409.2824

Country: Europe (0.46)

Genre: Research Report (0.82)

Industry: Law > Civil Rights & Constitutional Law (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Add feedback

Context-specific independence in graphical log-linear models

Nyman, Henrik, Pensar, Johan, Koski, Timo, Corander, Jukka

arXiv.org Machine LearningSep-9-2014

Log-linear models are the popular workhorses of analyzing contingency tables. A log-linear parameterization of an interaction model can be more expressive than a direct parameterization based on probabilities, leading to a powerful way of defining restrictions derived from marginal, conditional and context-specific independence. However, parameter estimation is often simpler under a direct parameterization, provided that the model enjoys certain decomposability properties. Here we introduce a cyclical projection algorithm for obtaining maximum likelihood estimates of log-linear parameters under an arbitrary context-specific graphical log-linear model, which needs not satisfy criteria of decomposability. We illustrate that lifting the restriction of decomposability makes the models more expressive, such that additional context-specific independencies embedded in real data can be identified. It is also shown how a context-specific graphical model can correspond to a non-hierarchical log-linear parameterization with a concise interpretation. This observation can pave way to further development of non-hierarchical log-linear models, which have been largely neglected due to their believed lack of interpretability.

artificial intelligence, bayesian inference, machine learning, (20 more...)

arXiv.org Machine Learning

doi: 10.1007/s00180-015-0606-6

1409.2713

Country: Europe > Finland (0.15)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Ambiguity-Driven Fuzzy C-Means Clustering: How to Detect Uncertain Clustered Records

Ghaffari, Meysam, Ghadiri, Nasser

arXiv.org Artificial IntelligenceSep-9-2014

As a well-known clustering algorithm, Fuzzy C-Means (FCM) allows each input sample to belong to more than one cluster, providing more flexibility than non-fuzzy clustering methods. However, the accuracy of FCM is subject to false detections caused by noisy records, weak feature selection and low certainty of the algorithm in some cases. The false detections are very important in some decision-making application domains like network security and medical diagnosis, where weak decisions based on such false detections may lead to catastrophic outcomes. They are mainly emerged from making decisions about a subset of records that do not provide enough evidence to make a good decision. In this paper, we propose a method for detecting such ambiguous records in FCM by introducing a certainty factor to decrease invalid detections. This approach enables us to send the detected ambiguous records to another discrimination method for a deeper investigation, thus increasing the accuracy by lowering the error rate. Most of the records are still processed quickly and with low error rate which prevents performance loss compared to similar hybrid methods. Experimental results of applying the proposed method on several datasets from different domains show a significant decrease in error rate as well as improved sensitivity of the algorithm.

ad-fcm, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/s10489-016-0759-1

1409.2821

Genre: Research Report > New Finding (0.68)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Bayesian Discovery of Threat Networks

Smith, Steven T., Kao, Edward K., Senne, Kenneth D., Bernstein, Garrett, Philips, Scott

arXiv.org Machine LearningSep-8-2014

A novel unified Bayesian framework for network detection is developed, under which a detection algorithm is derived based on random walks on graphs. The algorithm detects threat networks using partial observations of their activity, and is proved to be optimum in the Neyman-Pearson sense. The algorithm is defined by a graph, at least one observation, and a diffusion model for threat. A link to well-known spectral detection methods is provided, and the equivalence of the random walk and harmonic solutions to the Bayesian formulation is proven. A general diffusion model is introduced that utilizes spatio-temporal relationships between vertices, and is used for a specific space-time formulation that leads to significant performance improvements on coordinated covert networks. This performance is demonstrated using a new hybrid mixed-membership blockmodel introduced to simulate random covert networks with realistic properties.

artificial intelligence, data mining, machine learning, (20 more...)

arXiv.org Machine Learning

doi: 10.1109/TSP.2014.2336613

1311.5552

Country: North America > United States > Massachusetts > Middlesex County (0.28)

Genre: Research Report (0.64)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Variational Inference for Uncertainty on the Inputs of Gaussian Process Models

Damianou, Andreas C., Titsias, Michalis K., Lawrence, Neil D.

arXiv.org Artificial IntelligenceSep-8-2014

The Gaussian process latent variable model (GP-LVM) provides a flexible approach for non-linear dimensionality reduction that has been widely applied. However, the current approach for training GP-LVMs is based on maximum likelihood, where the latent projection variables are maximized over rather than integrated out. In this paper we present a Bayesian method for training GP-LVMs by introducing a non-standard variational inference framework that allows to approximately integrate out the latent variables and subsequently train a GP-LVM by maximizing an analytic lower bound on the exact marginal likelihood. We apply this method for learning a GP-LVM from iid observations and for learning non-linear dynamical systems where the observations are temporally correlated. We show that a benefit of the variational Bayesian procedure is its robustness to overfitting and its ability to automatically select the dimensionality of the nonlinear latent space. The resulting framework is generic, flexible and easy to extend for other purposes, such as Gaussian process regression with uncertain inputs and semi-supervised Gaussian processes. We demonstrate our method on synthetic data and standard machine learning benchmarks, as well as challenging real world datasets, including high resolution video data.

artificial intelligence, bayesian inference, machine learning, (15 more...)

arXiv.org Artificial Intelligence

1409.2287

Country:

North America > United States (1.00)
Europe (0.67)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Improving self-calibration

Enßlin, Torsten A., Junklewitz, Henrik, Winderling, Lars, Greiner, Maksim, Selig, Marco

arXiv.org Machine LearningSep-6-2014

Response calibration is the process of inferring how much the measured data depend on the signal one is interested in. It is essential for any quantitative signal estimation on the basis of the data. Here, we investigate self-calibration methods for linear signal measurements and linear dependence of the response on the calibration parameters. The common practice is to augment an external calibration solution using a known reference signal with an internal calibration on the unknown measurement signal itself. Contemporary self-calibration schemes try to find a self-consistent solution for signal and calibration by exploiting redundancies in the measurements. This can be understood in terms of maximizing the joint probability of signal and calibration. However, the full uncertainty structure of this joint probability around its maximum is thereby not taken into account by these schemes. Therefore better schemes -- in sense of minimal square error -- can be designed by accounting for asymmetries in the uncertainty of signal and calibration. We argue that at least a systematic correction of the common self-calibration scheme should be applied in many measurement situations in order to properly treat uncertainties of the signal on which one calibrates. Otherwise the calibration solutions suffer from a systematic bias, which consequently distorts the signal reconstruction. Furthermore, we argue that non-parametric, signal-to-noise filtered calibration should provide more accurate reconstructions than the common bin averages and provide a new, improved self-calibration scheme. We illustrate our findings with a simplistic numerical example.

artificial intelligence, calibration, machine learning, (18 more...)

arXiv.org Machine Learning

doi: 10.1103/PhysRevE.90.043301

1312.1349

Country: Europe > Germany (0.28)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.34)

Add feedback

Marginal Structured SVM with Hidden Variables

Ping, Wei, Liu, Qiang, Ihler, Alexander

arXiv.org Machine LearningSep-5-2014

In this work, we propose the marginal structured SVM (MSSVM) for structured prediction with hidden variables. MSSVM properly accounts for the uncertainty of hidden variables, and can significantly outperform the previously proposed latent structured SVM (LSSVM; Yu & Joachims (2009)) and other state-of-art methods, especially when that uncertainty is large. Our method also results in a smoother objective function, making gradient-based optimization of MSSVMs converge significantly faster than for LSSVMs. We also show that our method consistently outperforms hidden conditional random fields (HCRFs; Quattoni et al. (2007)) on both simulated and real-world datasets. Furthermore, we propose a unified framework that includes both our and several other existing methods as special cases, and provides insights into the comparison of different models in practice.

artificial intelligence, machine learning, prediction, (17 more...)

arXiv.org Machine Learning

1409.132

Country: North America > United States (0.68)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.68)
(2 more...)

Add feedback

Gradient density estimation in arbitrary finite dimensions using the method of stationary phase

Gurumoorthy, Karthik S., Rangarajan, Anand, Corring, John

arXiv.org Machine LearningSep-4-2014

We prove that the density function of the gradient of a sufficiently smooth function $S : \Omega \subset \mathbb{R}^d \rightarrow \mathbb{R}$, obtained via a random variable transformation of a uniformly distributed random variable, is increasingly closely approximated by the normalized power spectrum of $\phi=\exp\left(\frac{iS}{\tau}\right)$ as the free parameter $\tau \rightarrow 0$. The result is shown using the stationary phase approximation and standard integration techniques and requires proper ordering of limits. We highlight a relationship with the well-known characteristic function approach to density estimation, and detail why our result is distinct from this approach.

density function, machine learning, stationary point, (15 more...)

arXiv.org Machine Learning

1211.3038

Country: North America > United States (0.46)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

A Truncated EM Approach for Spike-and-Slab Sparse Coding

Sheikh, Abdul-Saboor, Shelton, Jacquelyn A., Lücke, Jörg

arXiv.org Machine LearningSep-3-2014

We study inference and learning based on a sparse coding model with `spike-and-slab' prior. As in standard sparse coding, the model used assumes independent latent sources that linearly combine to generate data points. However, instead of using a standard sparse prior such as a Laplace distribution, we study the application of a more flexible `spike-and-slab' distribution which models the absence or presence of a source's contribution independently of its strength if it contributes. We investigate two approaches to optimize the parameters of spike-and-slab sparse coding: a novel truncated EM approach and, for comparison, an approach based on standard factored variational distributions. The truncated approach can be regarded as a variational approach with truncated posteriors as variational distributions. In applications to source separation we find that both approaches improve the state-of-the-art in a number of standard benchmarks, which argues for the use of `spike-and-slab' priors for the corresponding data domains. Furthermore, we find that the truncated EM approach improves on the standard factored approach in source separation tasks$-$which hints to biases introduced by assuming posterior independence in the factored variational approach. Likewise, on a standard benchmark for image denoising, we find that the truncated EM approach improves on the factored variational approach. While the performance of the factored approach saturates with increasing numbers of hidden dimensions, the performance of the truncated approach improves the state-of-the-art for higher noise levels.

algorithm, experiment, sparse, (16 more...)

arXiv.org Machine Learning

1211.3589

Country:

Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Lower Saxony > Oldenburg (0.04)
Europe > Germany > Berlin (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback