Probability for improved business decisions: Introduction, Combinatorics, Bayesian Inference, Distributions. Welcome to Probability and Statistics for Business and Data Science! In this course we cover what you need to know about probability and statistics to succeed in business and the data science field! This practical course covers both the theory of statistics and its application to real-world problems. Each section has example problems, in-course quizzes, and assessment tests.

Today we are going to implement a Bayesian linear regression in R from scratch and use it to forecast US GDP growth. This post is based on a very informative manual from the Bank of England on Applied Bayesian Econometrics. I have translated the original Matlab code into R, since R is open source and widely used in data analysis and data science. My main goal in this post is to give people a better understanding of Bayesian statistics, some of its advantages, and some scenarios where you might want to use it. Let's take a moment to think about why we would even want to use Bayesian techniques in the first place.

Main Idea: A Bayesian Belief Network (BBN) represents a set of variables and their conditional dependencies via a Directed Acyclic Graph (DAG). The DAG lets us state the structure of, and the relationships between, the different variables explicitly. Everyday use cases: BBNs have many use cases, from helping to diagnose diseases to real-time predictions of a race outcome or advising marketing decisions.
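
To make the idea of a DAG with conditional dependencies concrete, here is a minimal sketch of a three-node network queried by brute-force enumeration. The network (Rain, Sprinkler, GrassWet), its probability tables, and all function names are illustrative assumptions and do not come from the text above; real applications would typically use a dedicated library rather than hand-written enumeration.

```python
from itertools import product

# Hypothetical network (illustrative numbers only):
#   Rain -> Sprinkler, Rain -> GrassWet, Sprinkler -> GrassWet
P_RAIN = {True: 0.2, False: 0.8}

# P(Sprinkler | Rain), indexed as P_SPRINKLER[rain][sprinkler]
P_SPRINKLER = {
    True: {True: 0.01, False: 0.99},
    False: {True: 0.4, False: 0.6},
}

# P(GrassWet = True | Sprinkler, Rain), keyed by (sprinkler, rain)
P_WET_TRUE = {
    (True, True): 0.99,
    (True, False): 0.9,
    (False, True): 0.8,
    (False, False): 0.0,
}


def joint(rain, sprinkler, wet):
    """Joint probability as the product of each node's conditional table."""
    p_wet = P_WET_TRUE[(sprinkler, rain)]
    if not wet:
        p_wet = 1.0 - p_wet
    return P_RAIN[rain] * P_SPRINKLER[rain][sprinkler] * p_wet


def posterior_rain_given_wet():
    """P(Rain = True | GrassWet = True), summing out Sprinkler."""
    numerator = sum(joint(True, s, True) for s in (True, False))
    evidence = sum(
        joint(r, s, True) for r, s in product((True, False), repeat=2)
    )
    return numerator / evidence


if __name__ == "__main__":
    print(round(posterior_rain_given_wet(), 4))
```

The key point the sketch illustrates is that the DAG lets the joint distribution factor into small per-node tables, so a query such as "how likely is rain, given wet grass?" reduces to sums over products of those tables.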

Despite the promise of big data, inferences are often limited not by sample size but rather by systematic effects. Only by carefully modeling these effects can we take full advantage of the data; big data must be complemented with big models and the algorithms that can fit them. One such algorithm is Hamiltonian Monte Carlo, which exploits the inherent geometry of the posterior distribution to admit full Bayesian inference that scales to the complex models of practical interest. In this talk I will discuss the theoretical foundations of Hamiltonian Monte Carlo, elucidating the geometric nature of its scalable performance and stressing the properties critical to a robust implementation. The talk is this Thurs, 6 Apr, 1:10-2:20pm in 303 Mudd Building at Columbia.
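
As a rough illustration of the mechanics behind Hamiltonian Monte Carlo, the sketch below samples a one-dimensional standard normal using a leapfrog integrator and a Metropolis accept/reject step. The target density, the step size, the number of leapfrog steps, and every function name are assumptions chosen for illustration; this is a textbook-style toy, not the implementation discussed in the talk.

```python
import math
import random


def potential(q):
    """Negative log density of a standard normal target, up to a constant."""
    return 0.5 * q * q


def grad_potential(q):
    """Gradient of the potential with respect to the position q."""
    return q


def leapfrog(q, p, step_size, n_steps):
    """Simulate Hamiltonian dynamics with the leapfrog integrator."""
    p -= 0.5 * step_size * grad_potential(q)  # initial half step in momentum
    for _ in range(n_steps - 1):
        q += step_size * p                    # full step in position
        p -= step_size * grad_potential(q)    # full step in momentum
    q += step_size * p                        # last full step in position
    p -= 0.5 * step_size * grad_potential(q)  # final half step in momentum
    return q, p


def hmc_sample(n_samples, step_size=0.2, n_steps=10, seed=0):
    """Draw samples with basic HMC (fixed step size and trajectory length)."""
    rng = random.Random(seed)
    q = 0.0
    samples = []
    for _ in range(n_samples):
        p0 = rng.gauss(0.0, 1.0)              # resample the momentum
        q_new, p_new = leapfrog(q, p0, step_size, n_steps)
        h_old = potential(q) + 0.5 * p0 * p0
        h_new = potential(q_new) + 0.5 * p_new * p_new
        # Metropolis correction for the integrator's discretization error
        if math.log(1.0 - rng.random()) < h_old - h_new:
            q = q_new
        samples.append(q)
    return samples


if __name__ == "__main__":
    draws = hmc_sample(5000)
    mean = sum(draws) / len(draws)
    var = sum((x - mean) ** 2 for x in draws) / len(draws)
    print(mean, var)
```

The leapfrog integrator is time-reversible and volume-preserving, which is why the simple accept/reject rule on the change in total energy is enough to keep the target distribution invariant. Practical implementations additionally adapt the step size and trajectory length, which this sketch deliberately omits.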

Stacking is a widely used model averaging technique that asymptotically yields optimal predictions among linear averages. We show that stacking is most effective when model predictive performance is heterogeneous in inputs, and that we can further improve the stacked mixture with a hierarchical model. We generalize stacking to Bayesian hierarchical stacking. The model weights vary as a function of the input data, are partially pooled, and are inferred using Bayesian inference. We further incorporate discrete and continuous inputs, other structured priors, and time series and longitudinal data.
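
For orientation, the sketch below shows ordinary stacking with a single set of weights shared by all inputs, which is the baseline that hierarchical stacking generalizes. It picks simplex weights that maximize the summed log of the weighted held-out predictive densities. The function name, the toy density values, and the choice of an EM-style fixed-point update as the optimizer are all assumptions for illustration; the input-dependent, partially pooled weights described above are not implemented here.

```python
def stacking_weights(pred_density, n_iter=500):
    """Estimate simplex weights that maximize the log score of the mixture.

    pred_density[i][k] is assumed to hold the held-out (for example
    leave-one-out) predictive density of model k at data point i.
    The update below is the EM fixed-point iteration for mixture weights
    with fixed components, used here as one simple optimizer.
    """
    n = len(pred_density)
    k = len(pred_density[0])
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        totals = [0.0] * k
        for row in pred_density:
            mix = sum(w * d for w, d in zip(weights, row))
            for j in range(k):
                # Share of point i's mixture density attributed to model j
                totals[j] += weights[j] * row[j] / mix
        weights = [t / n for t in totals]
    return weights


if __name__ == "__main__":
    # Hypothetical densities: model 0 is better on the first half of the
    # data, model 1 on the second half.
    toy = [[0.9, 0.1]] * 5 + [[0.1, 0.9]] * 5
    print(stacking_weights(toy))
```

In this toy input each model predicts well on a different half of the data, so a single global weight vector can only split the difference between them. That is the kind of input heterogeneity where letting the weights vary with the inputs can help.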