Paul Graham popularized the term "Bayesian Classification" (or more accurately "Naïve Bayesian Classification") after his "A Plan for Spam" article was published (http://www.paulgraham.com/spam.html). In fact, text classifiers based on naïve Bayesian and other techniques have been around for many years. Companies such as Autonomy and Interwoven incorporate machine-learning techniques to automatically classify documents of all kinds; one such machine-learning technique is naïve Bayesian text classification. Naïve Bayesian text classifiers are fast, accurate, simple, and easy to implement. In this article, I present a complete naïve Bayesian text classifier written in 100 lines of commented, nonobfuscated Perl.
Bayesian Computational Analyses with R is an introductory course on the use and implementation of Bayesian modeling using R software. The Bayesian approach is an alternative to the "frequentist" approach where one simply takes a sample of data and makes inferences about the likely parameters of the population. In contrast, the Bayesian approach uses both likelihood functions and a sample of observed data (the'prior') to estimate the most likely values and distributions for the estimated population parameters (the'posterior'). The course is useful to anyone who wishes to learn about Bayesian concepts and is suited to both novice and intermediate Bayesian students and Bayesian practitioners. It is both a practical, "hands-on" course with many examples using R scripts and software, and is conceptual, as the course explains the Bayesian concepts. All materials, software, R scripts, slides, exercises and solutions are included with the course materials. It is helpful to have some grounding in basic inferential statistics and probability theory. No experience with R is necessary, although it is also helpful.
Data scientist Stefano Cosentino observed in a post that the Bayesian approach leans more towards the distributions associated with each parameter. For instance, he writes that the two parameters depicted below, as shown by the Gaussian curves after a trained Bayesian network has converged. Hence the Bayesian approach, where the parameters are unknown quantities can be considered as random variables. University of Buffalo's paper defines the Bayesian approach to uncertainty, which treats all uncertain quantities as random variables and uses the laws of probability to manipulate those uncertain quantities. Hence, the right Bayesian approach integrates over all uncertain quantities rather than optimise them, states the paper.
To develop a "defendable and defensible" Bayesian learning model, we have to go beyond blindly'turning the crank' based on a "go-as-you-like" [approximate guess] prior. A lackluster attitude towards prior modeling could lead to disastrous inference, impacting various fields from clinical drug development to presidential election forecasts. The real questions are: How can we uncover the blind spots of the conventional wisdom-based prior? How can we develop the science of prior model-building that combines both data and science [DS-prior] in a testable manner – a double-yolk Bayesian egg? Unfortunately, these questions are outside the scope of business-as-usual Bayesian modus operandi and require new ideas.
Learning Bayesian networks from raw data can help provide insights into the relationships between variables. While real data often contains a mixture of discrete and continuous-valued variables, many Bayesian network structure learning algorithms assume all random variables are discrete. Thus, continuous variables are often discretized when learning a Bayesian network. However, the choice of discretization policy has significant impact on the accuracy, speed, and interpretability of the resulting models. This paper introduces a principled Bayesian discretization method for continuous variables in Bayesian networks with quadratic complexity instead of the cubic complexity of other standard techniques. Empirical demonstrations show that the proposed method is superior to the established minimum description length algorithm. In addition, this paper shows how to incorporate existing methods into the structure learning process to discretize all continuous variables and simultaneously learn Bayesian network structures.