Last summer, I was at a conference having lunch with Hal Daume III when we got to talking about how "Bayesian" can be a funny and ambiguous term. It seems like the definition should be straightforward: "following the work of English mathematician Rev. Thomas Bayes," perhaps, or even "uses Bayes' theorem." But many methods bearing the reverend's name or using his theorem aren't even considered "Bayesian" by his most religious followers. Why is it that Bayesian networks, for example, aren't considered… y'know… Bayesian? As I've read more outside the fields of machine learning and natural language processing -- from psychometrics and environmental biology to hackers who dabble in data science -- I've noticed three broad uses of the term "Bayesian."

I've spent the last few months preparing for and applying for data science jobs. It's possible the data science world will reject me for my lack of both experience and a credential above a bachelor's degree, in which case I'll do something else. Whatever lies in store for me, I think I've gotten a good grasp of the mindset underlying machine learning and how it differs from traditional statistics, so I thought I'd write about it for those with a background similar to mine who are considering a similar move.1 This post is geared toward people who are excellent at statistics but don't really "get" machine learning and want to understand the gist of it in about 15 minutes of reading. If you have a traditional academic stats background (be it econometrics, biostatistics, psychometrics, etc.), there are two good reasons to learn more about data science: The world of data science is, in many ways, hiding in plain sight from the more academically-minded quantitative disciplines.