Collaborating Authors

bayesian inference

Regularization -- Part 2


These are the lecture notes for FAU's YouTube Lecture "Deep Learning". This is a full transcript of the lecture video & matching slides. We hope, you enjoy this as much as the videos. Of course, this transcript was created with deep learning techniques largely automatically and only minor manual modifications were performed. If you spot mistakes, please let us know!

Causal AI & Bayesian Networks


We are all familiar with the dictum that "correlation does not imply causation". Furthermore, given a data file with samples of two variables x and z, we all know how to calculate the correlation between x and z. But it's only an elite minority, the few, the proud, the Bayesian Network aficionados, that know how to calculate the causal connection between x and z. Neural Net aficionados are incapable of doing this. Their Neural nets are just too wimpy to cut it.

Generative Adversarial Networks (GANs) & Bayesian Networks


Generative Adversarial Networks (GANs) software is software for producing forgeries and imitations of data (aka synthetic data, fake data). Human beings have been making fakes, with good or evil intent, of almost everything they possibly can, since the beginning of the human race. Thus, perhaps not too surprisingly, GAN software has been widely used since it was first proposed in this amazingly recent 2014 paper. To gauge how widely GAN software has been used so far, see, for example, this 2019 article entitled "18 Impressive Applications of Generative Adversarial Networks (GANs)" Sounds (voices, music,...), Images (realistic pictures, paintings, drawings, handwriting, ...), Text,etc. The forgeries can be tweaked so that they range from being very similar to the originals, to being whimsical exaggerations thereof.

Bayes' Theorem in Layman's Terms


If you have difficulty in understanding Bayes' theorem, trust me you are not alone. In this tutorial, I'll help you to cross that bridge step by step. Let's consider Alex and Brenda are two people in your office, When you are working you saw someone walked in front of you, and you didn't notice who is she/he. Now I'll give you extra information, Let's calculate the probabilities with this new information, Probability that Alex is the person passed by is 2/5 i.e, Probability that Brenda is the person passed by is 3/5 i.e, Probabilities that we are calculated before the new information are called Prior, and probabilities that we are calculated after the new information are called Posterior. Consider a scenario where, Alex comes to the office 3 days a week, and Brenda comes to the office 1 day a week.

Maximum likelihood estimation for Machine Learning - Nucleusbox


In the Logistic Regression for Machine Learning using Python blog, I have introduced the basic idea of the logistic function. We have discussed the cost function. And in the iterative method, we focus on the Gradient descent optimization method. Now so in this section, we are going to introduce the Maximum Likelihood cost function. And we would like to maximize this cost function.

Variational Bayes In Private Settings (VIPS)

Journal of Artificial Intelligence Research

Many applications of Bayesian data analysis involve sensitive information such as personal documents or medical records, motivating methods which ensure that privacy is protected. We introduce a general privacy-preserving framework for Variational Bayes (VB), a widely used optimization-based Bayesian inference method. Our framework respects differential privacy, the gold-standard privacy criterion, and encompasses a large class of probabilistic models, called the Conjugate Exponential (CE) family. We observe that we can straightforwardly privatise VB's approximate posterior distributions for models in the CE family, by perturbing the expected sufficient statistics of the complete-data likelihood. For a broadly-used class of non-CE models, those with binomial likelihoods, we show how to bring such models into the CE family, such that inferences in the modified model resemble the private variational Bayes algorithm as closely as possible, using the Pólya-Gamma data augmentation scheme. The iterative nature of variational Bayes presents a further challenge since iterations increase the amount of noise needed. We overcome this by combining: (1) an improved composition method for differential privacy, called the moments accountant, which provides a tight bound on the privacy cost of multiple VB iterations and thus significantly decreases the amount of additive noise; and (2) the privacy amplification effect of subsampling mini-batches from large-scale data in stochastic learning. We empirically demonstrate the effectiveness of our method in CE and non-CE models including latent Dirichlet allocation, Bayesian logistic regression, and sigmoid belief networks, evaluated on real-world datasets.

Vocabulary Alignment in Openly Specified Interactions

Journal of Artificial Intelligence Research

The problem of achieving common understanding between agents that use different vocabularies has been mainly addressed by techniques that assume the existence of shared external elements, such as a meta-language or a physical environment. In this article, we consider agents that use different vocabularies and only share knowledge of how to perform a task, given by the specification of an interaction protocol. We present a framework that lets agents learn a vocabulary alignment from the experience of interacting. Unlike previous work in this direction, we use open protocols that constrain possible actions instead of defining procedures, making our approach more general. We present two techniques that can be used either to learn an alignment from scratch or to repair an existent one, and we evaluate their performance experimentally.

35 Words About Uncertainty, Every AI-Savvy Leader Must Know


Bayes' rule: (or Bayes' theorem) of one probability theory's most important rules, describing the probability of an event, based on prior knowledge of conditions that might be related:

Machine learning for causal inference: on the use of cross-fit estimators Machine Learning

Modern causal inference methods allow machine learning to be used to weaken parametric modeling assumptions. However, the use of machine learning may result in bias and incorrect inferences due to overfitting. Cross-fit estimators have been proposed to eliminate this bias and yield better statistical properties. We conducted a simulation study to assess the performance of several different estimators for the average causal effect (ACE). The data generating mechanisms for the simulated treatment and outcome included log-transforms, polynomial terms, and discontinuities. We compared singly-robust estimators (g-computation, inverse probability weighting) and doubly-robust estimators (augmented inverse probability weighting, targeted maximum likelihood estimation). Nuisance functions were estimated with parametric models and ensemble machine learning, separately. We further assessed cross-fit doubly-robust estimators. With correctly specified parametric models, all of the estimators were unbiased and confidence intervals achieved nominal coverage. When used with machine learning, the cross-fit estimators substantially outperformed all of the other estimators in terms of bias, variance, and confidence interval coverage. Due to the difficulty of properly specifying parametric models in high dimensional data, doubly-robust estimators with ensemble learning and cross-fitting may be the preferred approach for estimation of the ACE in most epidemiologic studies. However, these approaches may require larger sample sizes to avoid finite-sample issues.