Bayesian Inference

Basics of Bayesian Decision Theory


The use of formal statistical methods to analyse quantitative data in data science has grown considerably in recent years. One such approach, Bayesian Decision Theory (BDT), closely related to Bayesian hypothesis testing and Bayesian inference, is a fundamental statistical framework that quantifies the tradeoffs between decisions using the probability distributions and costs that accompany them. In pattern recognition it is used to design classifiers under the assumption that the problem is posed in probabilistic terms and that all of the relevant probability values are known. In practice we rarely have such perfect information, but it is a good place to start when studying machine learning, statistical inference, and detection theory in signal processing. BDT also has many applications in science, engineering, and medicine.
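As a sketch of the idea: given known priors, class-conditional likelihoods, and a loss matrix (all numbers below are invented for illustration), the Bayes decision rule picks the action with minimum conditional risk:

```python
import numpy as np

# Hypothetical two-class problem: decide class 0 or 1 for one observation x.
# Priors, class-conditional likelihoods p(x|class), and a loss matrix are
# assumed known -- the idealized setting described above.
priors = np.array([0.6, 0.4])
likelihoods = np.array([0.2, 0.7])   # p(x | class) for the observed x
loss = np.array([[0.0, 1.0],         # loss[decision, true class]
                 [5.0, 0.0]])        # deciding 1 when truth is 0 costs 5

# Posterior via Bayes' rule: p(class | x) is proportional to p(x|class) p(class)
posterior = priors * likelihoods
posterior /= posterior.sum()

# Conditional risk of each decision: R(a | x) = sum_c loss[a, c] * p(c | x)
risk = loss @ posterior
decision = int(np.argmin(risk))
print(posterior, risk, decision)
```

Note that even though class 1 has the higher posterior (0.7 vs 0.3), the asymmetric loss makes decision 0 the lower-risk choice, which is exactly the tradeoff BDT formalizes.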

Naive Bayes Classification with Sklearn – Sicara Agile Big Data Development


This tutorial details the Naive Bayes classifier algorithm, its principle, and its pros and cons, and provides an example of a Gaussian Naive Bayes classifier using the scikit-learn Python library. Let's try to predict survival from passenger ticket fare information. Imagine you take a random sample of 500 passengers. In this sample, 30% of the people survived. Among the passengers who survived, the mean ticket fare is $100.
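A minimal sketch of that setup, using synthetic fare data generated to roughly match the sample described (the fare spreads and the non-survivor mean are assumptions, not figures from the tutorial):

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)

# Synthetic stand-in for the 500-passenger sample described above:
# roughly 30% survived; survivors' fares average around 100.
n = 500
survived = (rng.random(n) < 0.3).astype(int)
fare = np.where(survived == 1,
                rng.normal(100, 30, n),   # survivors: higher fares on average
                rng.normal(50, 20, n))    # non-survivors: assumed lower mean

model = GaussianNB()
model.fit(fare.reshape(-1, 1), survived)

# Predicted survival probability for a passenger who paid a 120 fare
print(model.predict_proba([[120.0]]))
```

GaussianNB fits one Gaussian per class and feature, then applies Bayes' rule with the empirical class priors, which is exactly the Naive Bayes recipe the tutorial walks through.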

Deep Bayesian Neural Networks. – Stefano Cosentino – Medium


Conventional neural networks aren't well designed to model the uncertainty associated with the predictions they make. One way to address that is to go fully Bayesian. What are we trying to do? The conventional (non-Bayesian) approach learns only a point estimate of the optimal parameter values via maximum likelihood estimation. A Bayesian approach, by contrast, is interested in the distribution associated with each parameter.
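To make the contrast concrete, here is a toy single-parameter version (not a neural network; the prior and the data are invented): maximum likelihood yields one number, while a Bayesian conjugate update yields a whole distribution:

```python
import numpy as np

# Toy contrast between a point estimate and a Bayesian posterior for a
# single parameter: the mean of a Gaussian with known variance.
data = np.array([2.1, 1.9, 2.4, 2.0, 2.2])
sigma2 = 0.25                      # assumed known observation variance

# Maximum likelihood: one number, no notion of uncertainty.
mle = data.mean()

# Bayesian: start from a prior N(mu0, tau02) and obtain a full posterior
# distribution over the parameter (standard Gaussian conjugate update).
mu0, tau02 = 0.0, 1.0
n = len(data)
post_var = 1.0 / (1.0 / tau02 + n / sigma2)
post_mean = post_var * (mu0 / tau02 + data.sum() / sigma2)

print(f"MLE point estimate: {mle:.3f}")
print(f"Posterior: N({post_mean:.3f}, {post_var:.3f})")
```

A Bayesian neural network applies the same idea to every weight, which is why exact inference becomes intractable and approximations are needed.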

How Bayesian Networks Are Superior in Understanding Effects of Variables


Bayes Nets (or Bayesian Networks) give remarkable results in determining the effects of many variables on an outcome, and they typically perform strongly even in cases where other methods falter or fail. These networks have seen relatively little use on business problems, although they have worked successfully for years in fields such as scientific research, public safety, aircraft guidance systems, and national defense. Notably, they often outperform regression, one of the most venerable, studied, and widely applied multivariate methods, particularly in determining variables' effects.

Bayesball: Bayesian analysis of batting average – Towards Data Science


One of the topics in data science and statistics that I find interesting, but have difficulty understanding, is Bayesian analysis. During General Assembly's Data Science Immersive boot camp, I had a chance to explore Bayesian statistics, but I really think I need some review and reinforcement. This is my personal endeavour to get a better understanding of Bayesian thinking and how it can be applied to real-life cases. For this post, I was mainly inspired by a YouTube series by Rasmus Bååth, "Introduction to Bayesian data analysis". He is really good at giving you an intuitive understanding of Bayesian analysis, not by bombarding you with complicated formulas, but by walking you through the thought process of Bayesian statistics. The topic I chose for this post is baseball.
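A staple of that thought process is the beta-binomial model for a batting average; a minimal sketch (the prior strength and the hit counts below are invented for illustration):

```python
from scipy import stats

# Beta-binomial update for a batting average. Prior: Beta(a, b) roughly
# centered on a .270 hitter; the counts are a hypothetical early season.
a, b = 81, 219                    # prior mean a / (a + b) = 0.27

hits, at_bats = 30, 100           # hypothetical record: raw average .300
posterior = stats.beta(a + hits, b + at_bats - hits)

# The posterior mean shrinks the raw .300 average back toward the prior.
print(f"raw average:    {hits / at_bats:.3f}")
print(f"posterior mean: {posterior.mean():.3f}")
```

This shrinkage toward the prior is why a hot first week of the season shouldn't move your estimate of a player's true ability very far.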

What is the difference between Markov chain approximation and variational approximation?


PageRank and RBMs are not Markov chain approximations; rather, they use Markov chains in their implementations. Similarly, LDA (Latent Dirichlet Allocation) is a generative probabilistic model (i.e., a Bayesian hierarchical model), not a variational approximation, although LDA may use variational methods for inference. Let me take the LDA model as an example. In LDA, a complicated generative model is constructed to learn the topic-allocation probabilities of different documents.
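For instance, scikit-learn's LatentDirichletAllocation fits LDA by optimizing a variational bound (online variational Bayes) rather than by sampling from a Markov chain. A small sketch on a made-up four-document corpus:

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Tiny made-up corpus: two documents about baseball, two about training
# neural networks, so two topics is a natural choice.
docs = [
    "the pitcher threw a fastball past the batter",
    "the batter hit a home run in the ninth inning",
    "the network learns weights by gradient descent",
    "gradient descent updates the network weights each step",
]

counts = CountVectorizer(stop_words="english").fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(counts)   # per-document topic proportions

print(doc_topics.round(2))               # each row sums to 1
```

The model itself (topics, Dirichlet priors, per-document topic mixtures) is the same regardless of whether inference is done with variational methods, as here, or with a Gibbs sampler.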

An intro to Reinforcement Learning (with otters) – Monica Dinculescu


Before I wrote the JavaScripts, I got a master's in AI (almost a decade ago) and wrote a thesis on a weird and new area of Reinforcement Learning. Or at least it was new then. With all the hype around Machine Learning and Deep Learning, I thought it would be neat to write a little primer on what Reinforcement Learning really means, and why it's different from just another neural net.

A Tour of The Top 10 Algorithms for Machine Learning Newbies


In machine learning, there's something called the "No Free Lunch" theorem. In a nutshell, it states that no one algorithm works best for every problem, and it's especially relevant for supervised learning (i.e. predictive modeling).

Machine Learning for Beginners, Part 8 – Support Vector Machine


In a February 6 blog, I discussed the supervised machine learning algorithm Naive Bayes, with an example that was hopefully easy for beginners to understand. During the summer of 2017, I began a five-part series on types of machine learning.

Human-in-the-Loop Synthesis for Partially Observable Markov Decision Processes

We study planning problems where autonomous agents operate inside environments that are subject to uncertainties and not fully observable. Partially observable Markov decision processes (POMDPs) are a natural formal model to capture such problems. Because of the potentially huge or even infinite belief space in POMDPs, synthesis with safety guarantees is, in general, computationally intractable. We propose an approach that aims to circumvent this difficulty: in scenarios that can be partially or fully simulated in a virtual environment, we actively integrate a human user to control an agent. While the user repeatedly tries to safely guide the agent in the simulation, we collect data from the human input. Via behavior cloning, we translate the data into a strategy for the POMDP. The strategy resolves all nondeterminism and non-observability of the POMDP, resulting in a discrete-time Markov chain (MC). The efficient verification of this MC gives quantitative insights into the quality of the inferred human strategy by proving or disproving given system specifications. For the case that the quality of the strategy is not sufficient, we propose a refinement method using counterexamples presented to the human. Experiments show that by including humans into the POMDP verification loop we improve the state of the art by orders of magnitude in terms of scalability.
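The verification step at the end can be illustrated on a tiny hand-made discrete-time Markov chain (not one from the paper): reachability probabilities, the basic quantity behind specifications like "reach the goal with probability at least 0.9", are obtained by solving one linear system:

```python
import numpy as np

# Reachability in a discrete-time Markov chain: the probability of
# eventually reaching the target satisfies x = A x + b over the
# transient states. The 4-state chain below is invented for illustration.
P = np.array([
    [0.0, 0.5, 0.5, 0.0],   # state 0
    [0.0, 0.0, 0.3, 0.7],   # state 1
    [0.0, 0.0, 1.0, 0.0],   # state 2: target (absorbing)
    [0.0, 0.0, 0.0, 1.0],   # state 3: failure (absorbing)
])
target = 2
transient = [0, 1]

A = P[np.ix_(transient, transient)]         # transient-to-transient block
b = P[np.ix_(transient, [target])].ravel()  # one-step jumps to the target
reach = np.linalg.solve(np.eye(len(transient)) - A, b)

# Probability of reaching the target from the initial state 0; this is
# the quantity compared against a given system specification.
print(reach[0])
```

Solving this system exactly is cheap, which is why verifying the induced Markov chain scales so much better than analyzing the POMDP's belief space directly.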