AITopics | Uncertainty

Collaborating Authors

Uncertainty

"AI systems–like people–must often act despite partial and uncertain information. First, the information received may be unreliable (e.g., a patient may mis-remember when a disease started, or may not have noticed a symptom that is important to a diagnosis). In addition, rules connecting real-world events can never include all the factors that might determine whether their conclusions really apply (e.g., the correctness of basing a diagnosis on a lab test depends whether there were conditions that might have caused a false positive, on the test being done correctly, on the results being associated with the right patient, etc.) Thus in order to draw useful conclusions, AI systems must be able to reason about the probability of events, given their current knowledge."
– from David Leake, Reasoning Under Uncertainty

News Overviews Instructional Materials AI-Alerts Classics

Bayesian CP Factorization of Incomplete Tensors with Automatic Rank Determination

Zhao, Qibin, Zhang, Liqing, Cichocki, Andrzej

arXiv.org Machine LearningOct-9-2014

--CANDECOMP/P ARAFAC (CP) tensor factorization of incomplete data is a powerful technique for tensor completion through explicitly capturing the multilinear latent factors. The existing CP algorithms require the tensor rank to be manually specified, however, the determination of tensor rank remains a challenging problem especially for CP rank . In addition, existing approaches do not take into account uncertainty information of latent factors, as well as missing entries. T o address these issues, we formulate CP factorization using a hierarchical probabilistic model and employ a fully Bayesian treatment by incorporating a sparsity-inducing prior over multiple latent factors and the appropriate hyperpriors over all hyperparameters, resulting in automatic rank determination. T o learn the model, we develop an efficient deterministic Bayesian inference algorithm, which scales linearly with data size. Our method is characterized as a tuning parameter-free approach, which can effectively infer underlying multilinear factors with a low-rank constraint, while also providing predictive distributions over missing entries. Extensive simulations on synthetic data illustrate the intrinsic capability of our method to recover the ground-truth of CP rank and prevent the overfitting problem, even when a large amount of entries are missing. Moreover, the results from real-world applications, including image inpainting and facial image synthesis, demonstrate that our method outperforms state-of-the-art approaches for both tensor factorization and tensor completion in terms of predictive performance. For instance, a video sequence can be represented by a third-order tensor with dimensionality of height width time; an image ensemble measured under multiple conditions can be represented by a higher order tensor with dimensionality ofpixel person pose illumination . T ensor factorization enables us to explicitly take into account the structure information by effectively capturing the multilinear interactions among multiple latent factors. Therefore, its theory and algorithms have been an active area of study during the past decade (see e.g., [1], [2]), and have been successfully applied to various application fields, such as face recognition, social network analysis, image and video completion, and brain signal processing. This issue has attracted a great deal of research interest in tensor completion in recent years. The objective of tensor factorization of incomplete data is to capture the underlying multilinear factors from only partially observed entries, which can in turn predict the missing entries.

artificial intelligence, machine learning, tensor, (20 more...)

arXiv.org Machine Learning

doi: 10.1109/TPAMI.2015.2392756

1401.6497

Country: Asia > China (0.28)

Genre:

Research Report > Promising Solution (0.48)
Overview > Innovation (0.34)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Add feedback

Projecting Ising Model Parameters for Fast Mixing

Domke, Justin, Liu, Xianghang

arXiv.org Machine LearningOct-8-2014

Inference in general Ising models is difficult, due to high treewidth making tree-based algorithms intractable. Moreover, when interactions are strong, Gibbs sampling may take exponential time to converge to the stationary distribution. We present an algorithm to project Ising model parameters onto a parameter set that is guaranteed to be fast mixing, under several divergences. We find that Gibbs sampling using the projected parameters is more accurate than with the original parameters when interaction strengths are strong and when limited time is available for sampling.

artificial intelligence, machine learning, original parameter euclidean piecewise kl, (14 more...)

arXiv.org Machine Learning

1407.0749

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.56)

Add feedback

Sequential Monte Carlo for Graphical Models

Naesseth, Christian A., Lindsten, Fredrik, Schön, Thomas B.

arXiv.org Machine LearningOct-6-2014

We propose a new framework for how to use sequential Monte Carlo (SMC) algorithms for inference in probabilistic graphical models (PGM). Via a sequential decomposition of the PGM we find a sequence of auxiliary distributions defined on a monotonically increasing sequence of probability spaces. By targeting these auxiliary distributions using SMC we are able to approximate the full joint distribution defined by the PGM. One of the key merits of the SMC sampler is that it provides an unbiased estimate of the partition function of the model. We also show how it can be used within a particle Markov chain Monte Carlo framework in order to construct high-dimensional block-sampling algorithms for general PGMs.

artificial intelligence, machine learning, sampler, (17 more...)

arXiv.org Machine Learning

1402.033

Country:

North America (0.46)
Oceania > Australia (0.28)
Europe > United Kingdom > England (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.88)

Add feedback

Gamma Processes, Stick-Breaking, and Variational Inference

Roychowdhury, Anirban, Kulis, Brian

arXiv.org Artificial IntelligenceOct-4-2014

While most Bayesian nonparametric models in machine learning have focused on the Dirichlet process, the beta process, or their variants, the gamma process has recently emerged as a useful nonparametric prior in its own right. Current inference schemes for models involving the gamma process are restricted to MCMC-based methods, which limits their scalability. In this paper, we present a variational inference framework for models involving gamma process priors. Our approach is based on a novel stick-breaking constructive definition of the gamma process. We prove correctness of this stick-breaking process by using the characterization of the gamma process as a completely random measure (CRM), and we explicitly derive the rate measure of our construction using Poisson process machinery. We also derive error bounds on the truncation of the infinite process required for variational inference, similar to the truncation analyses for other nonparametric models based on the Dirichlet and beta processes. Our representation is then used to derive a variational inference algorithm for a particular Bayesian nonparametric latent structure formulation known as the infinite Gamma-Poisson model, where the latent variables are drawn from a gamma process prior with Poisson likelihoods. Finally, we present results for our algorithms on nonnegative matrix factorization tasks on document corpora, and show that we compare favorably to both sampling-based techniques and variational approaches based on beta-Bernoulli priors.

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Artificial Intelligence

1410.1068

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Add feedback

Mapping Energy Landscapes of Non-Convex Learning Problems

Pavlovskaia, Maria, Tu, Kewei, Zhu, Song-Chun

arXiv.org Machine LearningOct-2-2014

In many statistical learning problems, the target functions to be optimized are highly non-convex in various model spaces and thus are difficult to analyze. In this paper, we compute Energy Landscape Maps (ELMs) which characterize and visualize an energy function with a tree structure, in which each leaf node represents a local minimum and each non-leaf node represents the barrier between adjacent energy basins. The ELM also associates each node with the estimated probability mass and volume for the corresponding energy basin. We construct ELMs by adopting the generalized Wang-Landau algorithm and multidomain sampler that simulates a Markov chain traversing the model space by dynamically reweighting the energy function. We construct ELMs in the model space for two classic statistical learning problems: i) clustering with Gaussian mixture models or Bernoulli templates; and ii) bi-clustering. We propose a way to measure the difficulties (or complexity) of these learning problems and study how various conditions affect the landscape complexity, such as separability of the clusters, the number of examples, and the level of supervision; and we also visualize the behaviors of different algorithms, such as K-mean, EM, two-step EM and Swendsen-Wang cuts, in the energy landscapes. Key words and phrases: Non-convex Optimization, Visualization, Clustering, Bi-clustering, Markov chain Monte Carlo. 1. INTRODUCTION In many statistical learning problems, the energy functions to be optimized are highly non-convex.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Machine Learning

1410.0576

Country: North America > United States > California > Los Angeles County > Los Angeles (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Education > Focused Education > Special Education (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Add feedback

Data Imputation through the Identification of Local Anomalies

Ozkan, Huseyin, Pelvan, Ozgun S., Kozat, Suleyman S.

arXiv.org Machine LearningSep-30-2014

We introduce a comprehensive and statistical framework in a model free setting for a complete treatment of localized data corruptions due to severe noise sources, e.g., an occluder in the case of a visual recording. Within this framework, we propose i) a novel algorithm to efficiently separate, i.e., detect and localize, possible corruptions from a given suspicious data instance and ii) a Maximum A Posteriori (MAP) estimator to impute the corrupted data. As a generalization to Euclidean distance, we also propose a novel distance measure, which is based on the ranked deviations among the data attributes and empirically shown to be superior in separating the corruptions. Our algorithm first splits the suspicious instance into parts through a binary partitioning tree in the space of data attributes and iteratively tests those parts to detect local anomalies using the nominal statistics extracted from an uncorrupted (clean) reference data set. Once each part is labeled as anomalous vs normal, the corresponding binary patterns over this tree that characterize corruptions are identified and the affected attributes are imputed. Under a certain conditional independency structure assumed for the binary patterns, we analytically show that the false alarm rate of the introduced algorithm in detecting the corruptions is independent of the data and can be directly set without any parameter tuning. The proposed framework is tested over several well-known machine learning data sets with synthetically generated corruptions; and experimentally shown to produce remarkable improvements in terms of classification purposes with strong corruption separation capabilities. Our experiments also indicate that the proposed algorithms outperform the typical approaches and are robust to varying training phase conditions.

artificial intelligence, corruption, machine learning, (15 more...)

arXiv.org Machine Learning

1409.8576

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Add feedback

Distributed Variational Inference in Sparse Gaussian Process Regression and Latent Variable Models

Gal, Yarin, van der Wilk, Mark, Rasmussen, Carl E.

arXiv.org Machine LearningSep-29-2014

Gaussian processes (GPs) are a powerful tool for probabilistic inference over functions. They have been applied to both regression and non-linear dimensionality reduction, and offer desirable properties such as uncertainty estimates, robustness to over-fitting, and principled ways for tuning hyper-parameters. However the scalability of these models to big datasets remains an active topic of research. We introduce a novel re-parametrisation of variational inference for sparse GP regression and latent variable models that allows for an efficient distributed algorithm. This is done by exploiting the decoupling of the data given the inducing points to re-formulate the evidence lower bound in a Map-Reduce setting. We show that the inference scales well with data and computational resources, while preserving a balanced distribution of the load among the nodes. We further demonstrate the utility in scaling Gaussian processes to big data. We show that GP performance improves with increasing amounts of data in regression (on flight data with 2 million records) and latent variable modelling (on MNIST). The results show that GPs perform better than many common models often used for big data.

dataset, inference, regression, (15 more...)

arXiv.org Machine Learning

1402.1389

Country:

South America > Paraguay > Asunción > Asunción (0.04)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.77)

Add feedback

A Bayesian Tensor Factorization Model via Variational Inference for Link Prediction

Ermis, Beyza, Cemgil, A. Taylan

arXiv.org Machine LearningSep-29-2014

Probabilistic approaches for tensor factorization aim to extract meaningful structure from incomplete data by postulating low rank constraints. Recently, variational Bayesian (VB) inference techniques have successfully been applied to large scale models. This paper presents full Bayesian inference via VB on both single and coupled tensor factorization models. Our method can be run even for very large models and is easily implemented. It exhibits better prediction performance than existing approaches based on maximum likelihood on several real-world datasets for missing link prediction problem.

artificial intelligence, bayesian inference, machine learning, (13 more...)

arXiv.org Machine Learning

1409.8276

Country:

North America > United States (0.28)
Asia > Middle East > Republic of Türkiye (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(2 more...)

Add feedback

The automatic creation of concept maps from documents written using morphologically rich languages

Zubrinic, Krunoslav, Kalpic, Damir, Milicevic, Mario

arXiv.org Artificial IntelligenceSep-27-2014

Concept map is a graphical tool for representing knowledge. They have been used in many different areas, including education, knowledge management, business and intelligence. Constructing of concept maps manually can be a complex task; an unskilled person may encounter difficulties in determining and positioning concepts relevant to the problem area. An application that recommends concept candidates and their position in a concept map can significantly help the user in that situation. This paper gives an overview of different approaches to automatic and semi-automatic creation of concept maps from textual and non-textual sources. The concept map mining process is defined, and one method suitable for the creation of concept maps from unstructured textual sources in highly inflected languages such as the Croatian language is described in detail. Proposed method uses statistical and data mining techniques enriched with linguistic tools. With minor adjustments, that method can also be used for concept map mining from textual sources in other morphologically rich languages.

artificial intelligence, concept map, machine learning, (20 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.eswa.2012.04.065

1210.7599

Country:

North America > United States (1.00)
Europe (1.00)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Education (1.00)
Health & Medicine (0.67)
Materials > Metals & Mining (0.54)
Media > News (0.46)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Order-invariant prior specification in Bayesian factor analysis

Leung, Dennis, Drton, Mathias

arXiv.org Machine LearningSep-26-2014

In (exploratory) factor analysis, the loading matrix is identified only up to orthogonal rotation. For identifiability, one thus often takes the loading matrix to be lower triangular with positive diagonal entries. In Bayesian inference, a standard practice is then to specify a prior under which the loadings are independent, the off-diagonal loadings are normally distributed, and the diagonal loadings follow a truncated normal distribution. This prior specification, however, depends in an important way on how the variables and associated rows of the loading matrix are ordered. We show how a minor modification of the approach allows one to compute with the identifiable lower triangular loading matrix but maintain invariance properties under reordering of the variables.

artificial intelligence, factor analysis, machine learning, (17 more...)

arXiv.org Machine Learning

1409.7672

Country: North America > United States > Washington > King County > Seattle (0.15)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Add feedback