AITopics | Uncertainty

Uncertainty

"AI systems–like people–must often act despite partial and uncertain information. First, the information received may be unreliable (e.g., a patient may mis-remember when a disease started, or may not have noticed a symptom that is important to a diagnosis). In addition, rules connecting real-world events can never include all the factors that might determine whether their conclusions really apply (e.g., the correctness of basing a diagnosis on a lab test depends whether there were conditions that might have caused a false positive, on the test being done correctly, on the results being associated with the right patient, etc.) Thus in order to draw useful conclusions, AI systems must be able to reason about the probability of events, given their current knowledge."
– from David Leake, Reasoning Under Uncertainty
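
The lab-test example in the quotation can be made concrete with Bayes' rule. The sketch below is a minimal Python illustration; the function name and all numbers (prevalence, sensitivity, false-positive rate) are hypothetical and do not come from the quoted source.

```python
def posterior_given_positive(prior, sensitivity, false_positive_rate):
    """Return P(disease | positive test) using Bayes' rule."""
    # Probability mass of a positive result that comes from true cases.
    true_pos = sensitivity * prior
    # Probability mass of a positive result that comes from healthy patients.
    false_pos = false_positive_rate * (1.0 - prior)
    return true_pos / (true_pos + false_pos)


# Hypothetical values: 1% prevalence, 95% sensitivity, 5% false-positive rate.
p = posterior_given_positive(prior=0.01, sensitivity=0.95, false_positive_rate=0.05)
print(f"P(disease | positive) = {p:.3f}")  # prints 0.161
```

Under these assumed numbers a positive result still leaves the diagnosis far from certain, which is the kind of conclusion a system can only reach by reasoning about probabilities.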

Bayesian Statistics Explained in Simple English For Beginners

#artificialintelligence Jun-21-2017, 19:45:18 GMT

Bayesian statistics remains incomprehensible to many analysts. Amazed by the power of machine learning, many of us have neglected statistics and narrowed our focus to machine learning alone. We fail to understand that machine learning is only one way to solve real-world problems. In several situations it does not help us solve business problems, even though data is involved. At the very least, knowledge of statistics lets you work on complex analytical problems irrespective of the size of the data. Bayes' theorem is named after Thomas Bayes, whose work on it was published posthumously in 1763.

artificial intelligence, bayesian inference, machine learning, (17 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

A new kernel-based approach to system identification with quantized output data

Bottegal, Giulio, Hjalmarsson, Håkan, Pillonetto, Gianluigi

arXiv.org Machine Learning Jun-20-2017

In this paper we introduce a novel method for linear system identification with quantized output data. We model the impulse response as a zero-mean Gaussian process whose covariance (kernel) is given by the recently proposed stable spline kernel, which encodes information on regularity and exponential stability. This serves as a starting point to cast our system identification problem into a Bayesian framework. We employ Markov Chain Monte Carlo methods to provide an estimate of the system. In particular, we design two methods based on the so-called Gibbs sampler that also allow the kernel hyperparameters to be estimated by marginal likelihood maximization via the expectation-maximization method. Numerical simulations show the effectiveness of the proposed scheme compared with state-of-the-art kernel-based methods when these are employed in system identification with quantized data.
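
The abstract above relies on the Gibbs sampler, which draws from a joint distribution by repeatedly sampling each variable from its conditional distribution given the others. The following is a minimal generic sketch in Python on a toy target (a standard bivariate normal with correlation `rho`). It is not the paper's sampler for quantized outputs; the function names, the target and every numeric setting are assumptions chosen for illustration.

```python
import math
import random


def gibbs_bivariate_normal(rho, n_samples, burn_in=500, seed=0):
    """Gibbs sampling for a standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    cond_sd = math.sqrt(1.0 - rho ** 2)  # std. dev. of each full conditional
    x, y = 0.0, 0.0
    samples = []
    for i in range(n_samples + burn_in):
        x = rng.gauss(rho * y, cond_sd)  # draw x | y ~ N(rho * y, 1 - rho^2)
        y = rng.gauss(rho * x, cond_sd)  # draw y | x ~ N(rho * x, 1 - rho^2)
        if i >= burn_in:                 # discard the warm-up draws
            samples.append((x, y))
    return samples


def correlation(samples):
    """Empirical Pearson correlation of a list of (x, y) pairs."""
    n = len(samples)
    mx = sum(s[0] for s in samples) / n
    my = sum(s[1] for s in samples) / n
    cov = sum((s[0] - mx) * (s[1] - my) for s in samples) / n
    vx = sum((s[0] - mx) ** 2 for s in samples) / n
    vy = sum((s[1] - my) ** 2 for s in samples) / n
    return cov / math.sqrt(vx * vy)


samples = gibbs_bivariate_normal(rho=0.8, n_samples=20000)
print(f"empirical correlation: {correlation(samples):.3f}")
```

With enough draws the empirical correlation should settle near the assumed `rho`, a quick sanity check that the chain targets the intended distribution.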

artificial intelligence, identification, machine learning, (17 more...)

arXiv.org Machine Learning

1610.0047

Country:

Europe (1.00)
North America > United States (0.28)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Bayesian Basics, Explained

@machinelearnbot Jun-19-2017, 21:00:06 GMT

Editor's note: The following is an interview with Columbia University Professor Andrew Gelman conducted by marketing scientist Kevin Gray, in which Gelman spells out the ABCs of Bayesian statistics. Kevin Gray: Most marketing researchers have heard of Bayesian statistics but know little about it. Can you briefly explain in layperson's terms what it is and how it differs from the 'ordinary' statistics most of us learned in college? Andrew Gelman: Bayesian statistics uses the mathematical rules of probability to combine data with "prior information" to give inferences which (if the model being used is correct) are more precise than would be obtained by either source of information alone. Classical statistical methods avoid prior distributions.
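
Gelman's point that combining data with prior information gives more precise inferences than either source alone can be illustrated with the textbook normal-normal conjugate update, in which precisions (inverse variances) add. This is a minimal Python sketch under assumed conditions (a normal prior on the mean and a known noise standard deviation); the function name and all numbers are hypothetical and are not taken from the interview.

```python
import math


def normal_posterior(prior_mean, prior_sd, sample_mean, sigma, n):
    """Posterior mean and sd for a normal mean with known noise sd sigma."""
    prior_prec = 1.0 / prior_sd ** 2   # precision contributed by the prior
    data_prec = n / sigma ** 2         # precision contributed by n observations
    post_prec = prior_prec + data_prec # precisions add under this model
    post_mean = (prior_prec * prior_mean + data_prec * sample_mean) / post_prec
    return post_mean, math.sqrt(1.0 / post_prec)


# Hypothetical inputs: prior N(0, 2^2); 4 observations averaging 1.0, noise sd 2.
post_mean, post_sd = normal_posterior(
    prior_mean=0.0, prior_sd=2.0, sample_mean=1.0, sigma=2.0, n=4
)
data_se = 2.0 / math.sqrt(4)  # standard error from the data alone
print(f"posterior mean = {post_mean:.3f}, posterior sd = {post_sd:.3f}")
print(f"prior sd = 2.000, data-only standard error = {data_se:.3f}")
```

In this setup the posterior standard deviation is smaller than both the prior standard deviation and the data-only standard error, provided the model is correct, which is the caveat Gelman attaches.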

artificial intelligence, bayesian method, machine learning, (17 more...)

@machinelearnbot

Genre: Personal > Interview (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)

GANGogh: Creating Art with GANs – Towards Data Science – Medium

@machinelearnbot Jun-19-2017, 19:15:14 GMT

The work presented here is the result of a semester-long independent research project carried out by Kenny Jones and Derrick Bonafilia (both Williams College 2017) under the guidance of Professor Andrea Danyluk. Kenny and Derrick are both heading to Facebook next year as Software Engineers and hope to continue studying GANs in whatever capacity is available to them. Generative Adversarial Networks (GANs) were introduced by Ian Goodfellow et al. GANs address the relative lack of success of deep generative models compared to deep discriminative models. The authors cite the intractable nature of the maximum likelihood estimation that is necessary for most generative models as the reason for this discrepancy.

Bayesian inference on random simple graphs with power law degree distributions

Lee, Juho, Heaukulani, Creighton, Ghahramani, Zoubin, James, Lancelot F., Choi, Seungjin

arXiv.org Machine Learning Jun-18-2017

We present a model for random simple graphs with a degree distribution that obeys a power law (i.e., is heavy-tailed). To attain this behavior, the edge probabilities in the graph are constructed from Bertoin-Fujita-Roynette-Yor (BFRY) random variables, which have been recently utilized in Bayesian statistics for the construction of power law models in several applications. Our construction readily extends to capture the structure of latent factors, similarly to stochastic blockmodels, while maintaining its power law degree distribution. The BFRY random variables are well approximated by gamma random variables in a variational Bayesian inference routine, which we apply to several network datasets for which power law degree distributions are a natural assumption. By learning the parameters of the BFRY distribution via probabilistic inference, we are able to automatically select the appropriate power law behavior from the data. In order to further scale our inference procedure, we adopt stochastic gradient ascent routines where the gradients are computed on minibatches (i.e., subsets) of the edges in the graph.

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Machine Learning

1702.08239

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.46)
North America > United States > California (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Law (0.81)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.84)

Provably Optimal Algorithms for Generalized Linear Contextual Bandits

Li, Lihong, Lu, Yu, Zhou, Dengyong

arXiv.org Artificial Intelligence Jun-18-2017

Contextual bandits are widely used in Internet services, from news recommendation to advertising and Web search. Generalized linear models (logistic regression in particular) have demonstrated stronger performance than linear models in many applications where rewards are binary. However, most theoretical analyses of contextual bandits so far concern linear bandits. In this work, we propose an algorithm based on upper confidence bounds for generalized linear contextual bandits, which achieves an $\tilde{O}(\sqrt{dT})$ regret over $T$ rounds with $d$-dimensional feature vectors. This regret matches the minimax lower bound, up to logarithmic terms, and improves on the best previous result by a $\sqrt{d}$ factor, assuming the number of arms is fixed. A key component in our analysis is to establish a new, sharp finite-sample confidence bound for maximum-likelihood estimates in generalized linear models, which may be of independent interest. We also analyze a simpler upper confidence bound algorithm, which is useful in practice, and prove it to have optimal regret for certain cases.
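
To show the upper-confidence-bound idea in its simplest form, the sketch below implements the classic UCB1 rule for a non-contextual Bernoulli bandit in Python. It is a deliberately simpler stand-in and not the generalized linear contextual algorithm analyzed in the paper; the function name, arm means, horizon and seed are all assumptions made for illustration.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Run UCB1 on Bernoulli arms and return the pull count of each arm."""
    rng = random.Random(seed)
    k = len(arm_means)
    counts = [0] * k   # number of pulls per arm
    sums = [0.0] * k   # total reward per arm
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # pull every arm once to initialize its estimate
        else:
            # Choose the arm with the largest mean estimate plus exploration bonus.
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts


# Hypothetical arms with success probabilities 0.2, 0.5 and 0.8.
counts = ucb1(arm_means=[0.2, 0.5, 0.8], horizon=5000)
print("pulls per arm:", counts)
```

The exploration bonus shrinks as an arm is pulled more often, so the rule gradually concentrates its pulls on the arm with the highest estimated reward while still occasionally revisiting the others.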

artificial intelligence, data mining, machine learning, (20 more...)

arXiv.org Artificial Intelligence

1703.00048

Country: North America > United States (0.46)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.30)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Data Science > Data Mining > Big Data (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

will wolf

#artificialintelligence Jun-17-2017, 09:16:01 GMT

Bayesian probabilistic models provide a nimble and expressive framework for modeling "small-world" data. In contrast, deep learning offers a more rigid yet much more powerful framework for modeling data of massive size. Edward is a probabilistic programming library that bridges this gap: "black-box" variational inference enables us to fit extremely flexible Bayesian models to large-scale data. Furthermore, these models themselves may take advantage of classic deep-learning architectures of arbitrary complexity. Edward uses TensorFlow for symbolic gradients and data flow graphs.

Bayesian Conditional Generative Adverserial Networks

Abbasnejad, M. Ehsan, Shi, Qinfeng, Abbasnejad, Iman, Hengel, Anton van den, Dick, Anthony

arXiv.org Machine Learning Jun-17-2017

Traditional GANs use a deterministic generator function (typically a neural network) to transform a random noise input $z$ to a sample $\mathbf{x}$ that the discriminator seeks to distinguish. We propose a new GAN called Bayesian Conditional Generative Adversarial Networks (BC-GANs) that uses a random generator function to transform a deterministic input $y'$ to a sample $\mathbf{x}$. Our BC-GANs extend traditional GANs to a Bayesian framework, and naturally handle unsupervised learning, supervised learning, and semi-supervised learning problems. Experiments show that the proposed BC-GANs outperform the state of the art.

artificial intelligence, discriminator, machine learning, (18 more...)

arXiv.org Machine Learning

1706.05477

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Tensor SVD: Statistical and Computational Limits

Zhang, Anru, Xia, Dong

arXiv.org Machine Learning Jun-17-2017

In this paper, we propose a general framework for tensor singular value decomposition (tensor SVD), which focuses on the methodology and theory for extracting the hidden low-rank structure from high-dimensional tensor data. Comprehensive results are developed on both the statistical and computational limits for tensor SVD. This problem exhibits three different phases according to the signal-to-noise ratio (SNR). In particular, with strong SNR, we show that the classical higher order orthogonal iteration achieves the minimax optimal rate of convergence in estimation; with weak SNR, the information-theoretical lower bound implies that it is impossible to have consistent estimation in general; with moderate SNR, we show that the non-convex maximum likelihood estimation provides the optimal solution, but with NP-hard computational cost; moreover, under the hardness hypothesis of hypergraphic planted clique detection, there are no polynomial-time algorithms performing consistently in general.

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Machine Learning

1703.02724

Country: North America > United States > Wisconsin (0.28)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)

Matching While Learning

Johari, Ramesh, Kamble, Vijay, Kanoria, Yash

arXiv.org Machine Learning Jun-17-2017

We consider the problem faced by a service platform that needs to match supply with demand, but also to learn attributes of new arrivals in order to match them better in the future. We introduce a benchmark model with heterogeneous workers and jobs that arrive over time. Job types are known to the platform, but worker types are unknown and must be learned by observing match outcomes. Workers depart after performing a certain number of jobs. The payoff from a match depends on the pair of types, and the goal is to maximize the steady-state rate of accumulation of payoff. Our main contribution is a complete characterization of the structure of the optimal policy in the limit where each worker performs many jobs. The platform faces a trade-off for each worker between myopically maximizing payoffs (exploitation) and learning the type of the worker (exploration). This creates a multitude of multi-armed bandit problems, one for each worker, coupled together by the constraint on the availability of jobs of different types (capacity constraints). We find that the platform should estimate a shadow price for each job type, and use the payoffs adjusted by these prices, first, to determine its learning goals and then, for each worker, (i) to balance learning with payoffs during the "exploration phase", and (ii) to myopically match after it has achieved its learning goals during the "exploitation phase."

data mining, machine learning, worker type, (19 more...)

arXiv.org Machine Learning

1603.04549

Genre: Research Report (0.81)

Industry: Banking & Finance > Economy (0.34)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
