AITopics | Uncertainty

Collaborating Authors

Uncertainty

"AI systems–like people–must often act despite partial and uncertain information. First, the information received may be unreliable (e.g., a patient may mis-remember when a disease started, or may not have noticed a symptom that is important to a diagnosis). In addition, rules connecting real-world events can never include all the factors that might determine whether their conclusions really apply (e.g., the correctness of basing a diagnosis on a lab test depends whether there were conditions that might have caused a false positive, on the test being done correctly, on the results being associated with the right patient, etc.) Thus in order to draw useful conclusions, AI systems must be able to reason about the probability of events, given their current knowledge."
– from David Leake, Reasoning Under Uncertainty

News Overviews Instructional Materials AI-Alerts Classics

Tensor-on-tensor regression

Lock, Eric F.

arXiv.org Machine LearningJun-26-2017

We propose a framework for the linear prediction of a multi-way array (i.e., a tensor) from another multi-way array of arbitrary dimension, using the contracted tensor product. This framework generalizes several existing approaches, including methods to predict a scalar outcome from a tensor, a matrix from a matrix, or a tensor from a scalar. We describe an approach that exploits the multiway structure of both the predictors and the outcomes by restricting the coefficients to have reduced CP-rank. We propose a general and efficient algorithm for penalized least-squares estimation, which allows for a ridge (L_2) penalty on the coefficients. The objective is shown to give the mode of a Bayesian posterior, which motivates a Gibbs sampling algorithm for inference. We illustrate the approach with an application to facial image data. An R package is available at https://github.com/lockEF/MultiwayRegression .

artificial intelligence, machine learning, regression, (18 more...)

arXiv.org Machine Learning

1701.01037

Genre: Research Report (0.40)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Add feedback

AI that can shoot down fighter planes helps treat bipolar disorder: Engineering and medical researchers apply genetic fuzzy logic successfully to predict treatment outcomes for bipolar patients

#artificialintelligenceJun-24-2017, 03:40:17 GMT

The findings open a world of possibility for using AI, or machine learning, to treat disease, researchers said. David Fleck, an associate professor at the UC College of Medicine, and his co-authors used artificial intelligence called "genetic fuzzy trees" to predict how bipolar patients would respond to lithium. Bipolar disorder, depicted in the TV show "Homeland" and the Oscar-winning "Silver Linings Playbook," affects as many as six million adults in the United States or 4 percent of the adult population in a given year. "In psychiatry, treatment of bipolar disorder is as much an art as a science," Fleck said. "Patients are fluctuating between periods of mania and depression. Treatments will change during those periods. It's really difficult to treat them appropriately during stages of the illness."

artificial intelligence, bipolar disorder, fuzzy logic, (11 more...)

#artificialintelligence

Country: North America > United States (0.50)

Genre: Research Report > New Finding (0.73)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)

Technology:

Information Technology > Artificial Intelligence > Applied AI (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.47)

Add feedback

Inference of High-dimensional Autoregressive Generalized Linear Models

Hall, Eric C., Raskutti, Garvesh, Willett, Rebecca

arXiv.org Machine LearningJun-24-2017

Vector autoregressive models characterize a variety of time series in which linear combinations of current and past observations can be used to accurately predict future observations. For instance, each element of an observation vector could correspond to a different node in a network, and the parameters of an autoregressive model would correspond to the impact of the network structure on the time series evolution. Often these models are used successfully in practice to learn the structure of social, epidemiological, financial, or biological neural networks. However, little is known about statistical guarantees on estimates of such models in non-Gaussian settings. This paper addresses the inference of the autoregressive parameters and associated network structure within a generalized linear model framework that includes Poisson and Bernoulli autoregressive processes. At the heart of this analysis is a sparsity-regularized maximum likelihood estimator. While sparsity-regularization is well-studied in the statistics and machine learning communities, those analysis methods cannot be applied to autoregressive generalized linear models because of the correlations and potential heteroscedasticity inherent in the observations. Sample complexity bounds are derived using a combination of martingale concentration inequalities and modern empirical process techniques for dependent random variables. These bounds, which are supported by several simulation studies, characterize the impact of various network parameters on estimator performance.

artificial intelligence, bayesian inference, machine learning, (19 more...)

arXiv.org Machine Learning

1605.02693

Country: North America > United States > Wisconsin (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)

Add feedback

Numbers war: How Bayesian vs frequentist statistics influence AI

#artificialintelligenceJun-23-2017, 06:05:09 GMT

If you want to develop your ML and AI skills, you will need to pick up some statistics and before you have got more than a few steps down that path you will find (whether you like it or not) that you have entered the Twilight Zone that is the frequentist/Bayesian religious war. I use the term "war" advisedly because war, by definition, has moved beyond debate and discussion. "Religious" because the war is based on belief systems, not information. The frequentist world has been briefly described here. The Bayesian world is described in what follows.

artificial intelligence, bayesian inference, machine learning, (16 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.30)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.30)

Add feedback

Polya Urn Latent Dirichlet Allocation: a doubly sparse massively parallel sampler

Terenin, Alexander, Magnusson, Måns, Jonsson, Leif, Draper, David

arXiv.org Machine LearningJun-23-2017

Latent Dirichlet Allocation (LDA) is a topic model widely used in natural language processing and machine learning. Most approaches to training the model rely on iterative algorithms, which makes it difficult to run LDA on big data sets that are best analyzed in parallel and distributed computational environments. Indeed, current approaches to parallel inference either don't converge to the correct posterior or require storage of large dense matrices in memory. We present a novel sampler that overcomes both problems, and we show that this sampler is faster, both empirically and theoretically, than previous Gibbs samplers for LDA. We do so by employing a novel Pólya-Urn-based approximation in the sparse partially collapsed sampler for LDA. We prove that the approximation error vanishes with data size, making our algorithm asymptotically exact, a property of importance for large-scale topic models. In addition, we show, via an explicit example, that - contrary to popular belief in the topic modeling literature - partially collapsed samplers can be more efficient than fully collapsed samplers. We conclude by comparing the performance of our algorithm with that of other approaches on well-known corpora. Keywords: Bayesian inference, Big Data, computational complexity, Gibbs sampling, Latent Dirichlet Allocation, Markov Chain Monte Carlo, natural language processing, parallel and distributed systems, topic models.

machine learning, natural language, sampler, (18 more...)

arXiv.org Machine Learning

1704.03581

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Add feedback

Effects of Additional Data on Bayesian Clustering

Yamazaki, Keisuke

arXiv.org Machine LearningJun-23-2017

Hierarchical probabilistic models, such as mixture models, are used for cluster analysis. These models have two types of variables: observable and latent. In cluster analysis, the latent variable is estimated, and it is expected that additional information will improve the accuracy of the estimation of the latent variable. Many proposed learning methods are able to use additional data; these include semi-supervised learning and transfer learning. However, from a statistical point of view, a complex probabilistic model that encompasses both the initial and additional data might be less accurate due to having a higher-dimensional parameter. The present paper presents a theoretical analysis of the accuracy of such a model and clarifies which factor has the greatest effect on its accuracy, the advantages of obtaining additional data, and the disadvantages of increasing the complexity.

additional data, artificial intelligence, machine learning, (16 more...)

arXiv.org Machine Learning

1607.03574

Country: Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.68)

Add feedback

Learning Discrete Bayesian Networks from Continuous Data

Chen, Yi-Chun, Wheeler, Tim A., Kochenderfer, Mykel J.

Journal of Artificial Intelligence ResearchJun-22-2017

Learning Bayesian networks from raw data can help provide insights into the relationships between variables. While real data often contains a mixture of discrete and continuous-valued variables, many Bayesian network structure learning algorithms assume all random variables are discrete. Thus, continuous variables are often discretized when learning a Bayesian network. However, the choice of discretization policy has significant impact on the accuracy, speed, and interpretability of the resulting models. This paper introduces a principled Bayesian discretization method for continuous variables in Bayesian networks with quadratic complexity instead of the cubic complexity of other standard techniques. Empirical demonstrations show that the proposed method is superior to the established minimum description length algorithm. In addition, this paper shows how to incorporate existing methods into the structure learning process to discretize all continuous variables and simultaneously learn Bayesian network structures.

bayesian network, discretization method, discretization policy, (15 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.5371

AI Access Foundation

11063

Journal of Artificial Intelligence Research

Country:

North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > California > Orange County > Irvine (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Efficient Approximate Solutions to Mutual Information Based Global Feature Selection

Venkateswara, Hemanth, Lade, Prasanth, Lin, Binbin, Ye, Jieping, Panchanathan, Sethuraman

arXiv.org Machine LearningJun-22-2017

Mutual Information (MI) is often used for feature selection when developing classifier models. Estimating the MI for a subset of features is often intractable. We demonstrate, that under the assumptions of conditional independence, MI between a subset of features can be expressed as the Conditional Mutual Information (CMI) between pairs of features. But selecting features with the highest CMI turns out to be a hard combinatorial problem. In this work, we have applied two unique global methods, Truncated Power Method (TPower) and Low Rank Bilinear Approximation (LowRank), to solve the feature selection problem. These algorithms provide very good approximations to the NP-hard CMI based feature selection problem. We experimentally demonstrate the effectiveness of these procedures across multiple datasets and compare them with existing MI based global and iterative feature selection procedures.

artificial intelligence, feature selection, machine learning, (15 more...)

arXiv.org Machine Learning

1706.07535

Country: North America > United States > Michigan (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Continuum Limit of Posteriors in Graph Bayesian Inverse Problems

Trillos, Nicolas Garcia, Sanz-Alonso, Daniel

arXiv.org Machine LearningJun-22-2017

We consider the problem of recovering a function input of a differential equation formulated on an unknown domain $M$. We assume to have access to a discrete domain $M_n=\{x_1, \dots, x_n\} \subset M$, and to noisy measurements of the output solution at $p\le n$ of those points. We introduce a graph-based Bayesian inverse problem, and show that the graph-posterior measures over functions in $M_n$ converge, in the large $n$ limit, to a posterior over functions in $M$ that solves a Bayesian inverse problem with known domain. The proofs rely on the variational formulation of the Bayesian update, and on a new topology for the study of convergence of measures over functions on point clouds to a measure over functions on the continuum. Our framework, techniques, and results may serve to lay the foundations of robust uncertainty quantification of graph-based tasks in machine learning. The ideas are presented in the concrete setting of recovering the initial condition of the heat equation on an unknown manifold.

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Machine Learning

1706.07193

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Add feedback

Horseshoe Regularization for Feature Subset Selection

Bhadra, Anindya, Datta, Jyotishka, Polson, Nicholas G., Willard, Brandon

arXiv.org Machine LearningJun-22-2017

Feature subset selection arises in many high-dimensional applications of statistics, such as compressed sensing and genomics. The $\ell_0$ penalty is ideal for this task, the caveat being it requires the NP-hard combinatorial evaluation of all models. A recent area of considerable interest is to develop efficient algorithms to fit models with a non-convex $\ell_\gamma$ penalty for $\gamma\in (0,1)$, which results in sparser models than the convex $\ell_1$ or lasso penalty, but is harder to fit. We propose an alternative, termed the horseshoe regularization penalty for feature subset selection, and demonstrate its theoretical and computational advantages. The distinguishing feature from existing non-convex optimization approaches is a full probabilistic representation of the penalty as the negative of the logarithm of a suitable prior, which in turn enables efficient expectation-maximization and local linear approximation algorithms for optimization and MCMC for uncertainty quantification. In synthetic and real data, the resulting algorithms provide better statistical performance, and the computation requires a fraction of time of state-of-the-art non-convex solvers.

artificial intelligence, bayesian inference, machine learning, (20 more...)

arXiv.org Machine Learning

1702.074

Country: North America > United States > Arkansas (0.28)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.48)
Health & Medicine > Therapeutic Area > Oncology > Leukemia (0.46)
Health & Medicine > Therapeutic Area > Hematology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback