 pymc3


Introduction to Bayesian Modeling with PyMC3 - Dr. Juan Camilo Orduz

#artificialintelligence

We can also see this visually. We can verify the convergence of the chains formally using the Gelman-Rubin statistic (R-hat); values close to 1.0 indicate convergence. We can also test for correlation between successive samples in the chains: we aim for near-zero autocorrelation, so that we obtain effectively independent ("random") samples from the posterior distribution.
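Both diagnostics can be computed directly from the raw chains. A minimal NumPy sketch (a simplified version of the statistic; PyMC3's own implementation also splits chains and handles multiple variables):

```python
import numpy as np

def gelman_rubin(chains):
    """Potential scale reduction factor (R-hat) for chains of shape (m, n)."""
    m, n = chains.shape
    B = n * chains.mean(axis=1).var(ddof=1)   # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()     # within-chain variance
    var_hat = (n - 1) / n * W + B / n         # pooled variance estimate
    return np.sqrt(var_hat / W)

def autocorr(x, lag):
    """Sample autocorrelation of a single chain at a given lag."""
    x = x - x.mean()
    return (x[:-lag] * x[lag:]).sum() / (x * x).sum()

rng = np.random.default_rng(42)
chains = rng.normal(size=(4, 2000))  # four well-mixed chains from the same target
print(float(gelman_rubin(chains)))   # close to 1.0 -> converged
print(float(autocorr(chains[0], 1))) # close to 0 for independent draws
```

For chains stuck in different modes, the between-chain variance B inflates the pooled estimate and R-hat rises well above 1.0, which is exactly the failure the diagnostic is designed to flag.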


Hierarchical Bayesian Neural Networks with Informative Priors

#artificialintelligence

Imagine you have a machine learning (ML) problem but only small data (gasp, yes, this does exist). This often happens when your data set is nested -- you might have many data points overall, but only a few per category. For example, in ad-tech you may want to predict how likely a user is to buy a certain product. There could be thousands of products, but you only have a small number of measurements for each. Likely, there will be similarities between the product categories, but they will all have individual differences too.
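The standard Bayesian answer to this structure is partial pooling: per-category estimates get shrunk toward the global mean, and sparse categories get shrunk the most. A toy NumPy sketch of that shrinkage behavior (an empirical-Bayes approximation with an assumed noise scale `sigma` and between-category spread `tau`, not the article's actual hierarchical model):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 0.3                          # assumed within-product observation noise
tau = 0.1                            # assumed between-product spread (prior sd)
counts = np.array([500, 50, 10, 2])  # measurements per product: very unequal
true_means = rng.normal(0.5, tau, size=len(counts))

# Per-product sample means from simulated measurements.
obs = [rng.normal(mu, sigma, size=n) for mu, n in zip(true_means, counts)]
sample_means = np.array([x.mean() for x in obs])
grand_mean = np.concatenate(obs).mean()

# Posterior mean under a Normal prior centered at the grand mean:
# the weight on a product's own data grows with its sample size,
# so sparsely observed products are pulled toward the pooled estimate.
weight = counts / (counts + sigma**2 / tau**2)
partial_pooled = weight * sample_means + (1 - weight) * grand_mean

for n, w in zip(counts, weight):
    print(n, float(w))  # 500 -> weight near 1; 2 -> strong shrinkage
```

This is the intuition the hierarchical neural network builds on: categories with little data borrow statistical strength from the rest.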


How I Learned to Stop Worrying and Love Uncertainty

#artificialintelligence

Since their early days, humans have had an important, often antagonistic relationship with uncertainty; we try to kill it everywhere we find it. Without an explanation for many natural phenomena, humans invented gods to explain them, and without certainty of the future, they consulted oracles. It was precisely the oracle's role to reduce uncertainty for their fellow humans, predicting their future and giving counsel according to their gods' will, and even though their accuracy left much to be desired, they were believed, for any measure of certainty is better than none. As society grew more sophisticated, oracles were (not completely) displaced by empirical thought, which proved much more successful at prediction and counsel. Empiricism itself evolved into the collection of techniques we call the scientific method, which has proven to be much more effective at reducing uncertainty, and is modern society's most trustworthy way of producing predictions.



Theano, TensorFlow and the Future of PyMC – PyMC Developers – Medium

#artificialintelligence

Since the Theano team announced that it would cease development and maintenance of Theano within a year, we, the PyMC developers, have been actively discussing what to do about this. We are very excited to announce that the new version of PyMC will use TensorFlow Probability (TFP) as its backend. TensorFlow already has a very broad user base, and with TFP it gained a powerful new library with elegant support for probability distributions and transformations (called bijectors; see the TFP paper for a full description), as well as a layer for constructing probabilistic models, called Edward2. It is clear that TFP's focus is to provide a strong foundation upon which flexible statistical models for inference and prediction can be constructed from the ground up. Its focus is not, however, to provide a high-level API which makes construction and fitting of common classes of models easy for applied users.
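The idea behind bijectors is the classic change-of-variables rule: push samples through an invertible map and correct the density by the log of the Jacobian determinant. A NumPy sketch of the exp transform turning a Normal into a LogNormal (illustrating the concept only, not TFP's actual API):

```python
import numpy as np

mu, sd = 0.0, 0.5
rng = np.random.default_rng(1)
z = rng.normal(mu, sd, size=200_000)  # base distribution: Normal(mu, sd)
x = np.exp(z)                         # forward transform: the "exp bijector"

def lognormal_logpdf(x):
    """log p_X(x) = log p_Z(log x) - log|dx/dz|, with dx/dz = exp(z) = x."""
    z = np.log(x)
    base = -0.5 * ((z - mu) / sd) ** 2 - np.log(sd * np.sqrt(2 * np.pi))
    return base - np.log(x)           # Jacobian correction for exp

# Transformed samples match the analytic LogNormal mean exp(mu + sd^2 / 2).
print(float(x.mean()), float(np.exp(mu + sd**2 / 2)))
```

Composing many such invertible transforms, each with a tractable Jacobian, is what makes bijectors a flexible building block for reparameterizing distributions.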


Introduction to Probabilistic Machine Learning with PyMC3

#artificialintelligence

Machine learning has gone mainstream and now powers several real-world applications, like autonomous vehicles at Uber & Tesla, recommendation engines at Amazon & Netflix, and much more. In this meetup, I introduced probabilistic machine learning and probabilistic programming with PyMC3. I discussed the basics of machine learning from a probabilistic/Bayesian perspective and contrasted it with traditional/algorithmic machine learning. I also discussed how to build probabilistic models in computer code using an exciting new programming paradigm called Probabilistic Programming (PP). In particular, I used PyMC3, a PP language, to build models ranging from simple generalized linear models to clustering models for machine learning.
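The probabilistic perspective can be seen in miniature with conjugate Bayesian updating, where no sampler is needed at all. A minimal Beta-Binomial sketch (a deliberately simple stand-in for the PyMC3 models discussed in the talk, with made-up numbers):

```python
# Beta-Binomial model: Beta(a, b) prior on a success rate, Binomial likelihood.
a, b = 1, 1                     # uniform prior: no initial opinion about the rate
heads, trials = 27, 100         # observed data: 27 successes in 100 trials

# Conjugacy gives the posterior in closed form: Beta(a + successes, b + failures).
post_a, post_b = a + heads, b + (trials - heads)
post_mean = post_a / (post_a + post_b)
print(post_a, post_b, round(post_mean, 3))  # -> 28 74 0.275
```

The probabilistic answer is a full posterior distribution over the rate rather than a single point estimate; PyMC3 generalizes this to models where no closed form exists, by sampling instead.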


Hello, world! Stan, PyMC3, and Edward - Statistical Modeling, Causal Inference, and Social Science

@machinelearnbot

In both Stan and Edward, the program defining a model defines a joint log density that acts as a function from data sets to concrete posterior densities. In both Stan and Edward, the language distinguishes data variables from parameter variables and provides an object-level representation of data variables. In PyMC3, the data is included as simple Python types in the model objects as the graph is built. So to make a model abstract over its data, you'd have to write a function that takes the data variables as arguments and returns the model instantiated with that data. The definition of the deterministic node mu here is in terms of the actual data vectors X1 and X2; these aren't placeholders, their values are taken from the containing environment.


5 Machine Learning Projects You Should Not Overlook

#artificialintelligence

After a hiatus, the "Overlook..." posts are making their comeback this month, continuing the modest quest of bringing formidable, lesser-known machine learning projects to a few additional sets of eyes. Check out the 5 projects below for some potential fresh machine learning ideas. One of them, skift, provides Scikit-learn wrappers for the fastText classifier; a common use case is to have fastText use a single text column as input, ignoring other columns. This is especially useful when fastText is one of several classifiers in a stacking classifier, with the other classifiers using non-textual features. Understanding fastText is the important piece of the puzzle, but once this understanding is in hand, skift helps you easily implement fastText, as well as integrate it with other Scikit-learn functionality in general.


Stan vs PyMc3 (vs Edward) – Towards Data Science

@machinelearnbot

The holy trinity when it comes to being Bayesian. I will share my experience using the first two packages and my high-level opinion of the third (I haven't used it in practice). Of course, then there are the mad men (old professors who are becoming irrelevant) who actually do their own Gibbs sampling. You specify the generative model for the data, feed in the data as observations, and the package samples from the posterior distribution of the parameters for you.
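Hand-rolled Gibbs sampling, for what it's worth, is less mysterious than it sounds: you alternately draw each variable from its full conditional given the others. A minimal NumPy sketch for a standard bivariate Normal with correlation `rho`, where each conditional is itself Normal (a textbook toy case, not anything the compared packages do internally):

```python
import numpy as np

rho = 0.8                      # correlation of the bivariate Normal target
rng = np.random.default_rng(7)
cond_sd = np.sqrt(1 - rho**2)  # sd of each full conditional

x = y = 0.0
samples = np.empty((20_000, 2))
for i in range(len(samples)):
    # Alternately draw each coordinate from its full conditional:
    # x | y ~ N(rho * y, 1 - rho^2), and symmetrically for y | x.
    x = rng.normal(rho * y, cond_sd)
    y = rng.normal(rho * x, cond_sd)
    samples[i] = x, y

# The empirical correlation recovers rho (up to Monte Carlo error).
print(float(np.corrcoef(samples.T)[0, 1]))
```

Stan, PyMC3, and Edward exist precisely so that you specify only the model, and samplers like this (and far better ones, like NUTS) are derived for you.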


The Algorithms Behind Probabilistic Programming

#artificialintelligence

Moreover, these algorithms are robust, so they don't require problem-specific hand-tuning. One powerful example is sampling from an arbitrary probability distribution, which we need to do often (and efficiently!) when doing inference. The brute-force approach, rejection sampling, is problematic because acceptance rates are low: only a tiny fraction of attempts generate successful samples, so the algorithm is slow and inefficient. See this post by Jeremy Kun for further details. Until recently, the main alternative to this naive approach was Markov chain Monte Carlo sampling (of which Metropolis-Hastings and Gibbs sampling are well-known examples). If you used Bayesian inference in the 90s or early 2000s, you may remember BUGS (and WinBUGS) or JAGS, which used these methods. These remain popular teaching tools (see e.g.
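Metropolis-Hastings needs only an unnormalized density: propose a local move and accept it with probability min(1, p(x')/p(x)), so the normalizing constant cancels. A minimal NumPy sketch with a random-walk proposal targeting an unnormalized Gaussian (illustrative toy values throughout):

```python
import numpy as np

def logp(x):
    """Unnormalized log-density of the target: N(2, 1) without its constant."""
    return -0.5 * (x - 2.0) ** 2

rng = np.random.default_rng(3)
x, samples = 0.0, []
for _ in range(50_000):
    proposal = x + rng.normal(0, 1.0)        # symmetric random-walk proposal
    # Accept with probability min(1, p(proposal)/p(x)); compare logs for stability.
    if np.log(rng.uniform()) < logp(proposal) - logp(x):
        x = proposal
    samples.append(x)                        # rejected moves repeat the current state

burned = np.array(samples[5_000:])           # discard burn-in
print(float(burned.mean()), float(burned.std()))  # near the target's mean 2, sd 1
```

Unlike rejection sampling, every iteration yields a (possibly repeated) sample, so nothing is thrown away; the price is autocorrelation between successive draws, which is exactly what the newer gradient-based samplers in modern probabilistic programming systems reduce.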