Since their early days, humans have had an important, often antagonistic relationship with uncertainty: we try to kill it wherever we find it. Lacking explanations for many natural phenomena, humans invented gods to explain them, and lacking certainty about the future, they consulted oracles. It was precisely the oracles' role to reduce uncertainty for their fellow humans, predicting their future and giving counsel according to their gods' will. Even though their accuracy left much to be desired, they were believed, for any measure of certainty is better than none. As society grew more sophisticated, oracles were displaced (though not completely) by empirical thought, which proved much more successful at prediction and counsel. Empiricism itself evolved into the collection of techniques we call the scientific method, which has proven even more effective at reducing uncertainty and is modern society's most trustworthy way of producing predictions.

Frequentist statistics tests whether an event (hypothesis) occurs or not. It calculates the probability of an event as its frequency in the long run of a repeated experiment. A common flaw of the frequentist approach is that the result of an experiment depends on the number of times the experiment is repeated. Bayesian statistics is a mathematical procedure that applies probabilities to statistical problems. It gives people the tools to update their beliefs in light of new data.
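To make the idea of updating beliefs concrete, here is a minimal sketch of a Bayesian update for a coin's probability of heads. It assumes a Beta prior with binomial data (a standard conjugate pair, chosen here only for illustration); the function names and the coin-flip counts are hypothetical and do not come from the text above.

```python
def beta_binomial_update(alpha, beta, successes, failures):
    """Return the posterior Beta parameters after observing binomial data.

    With a Beta(alpha, beta) prior and a binomial likelihood, the posterior
    is Beta(alpha + successes, beta + failures).
    """
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Assumption: start from a uniform prior, Beta(1, 1).
alpha, beta = 1.0, 1.0

# Hypothetical first batch of data: 7 heads, 3 tails.
alpha, beta = beta_binomial_update(alpha, beta, successes=7, failures=3)
first_mean = beta_mean(alpha, beta)

# Hypothetical second batch: 2 heads, 8 tails. The previous posterior
# serves as the new prior, so the belief is revised rather than restarted.
alpha, beta = beta_binomial_update(alpha, beta, successes=2, failures=8)
second_mean = beta_mean(alpha, beta)

print(f"posterior mean after batch 1: {first_mean:.4f}")
print(f"posterior mean after batch 2: {second_mean:.4f}")
```

The posterior after the first batch serves as the prior for the second, which is the sense in which beliefs are updated as new data arrives.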

Probability is a measure of uncertainty. It applies to machine learning because, in the real world, we need to make decisions with incomplete information. Hence, we need a mechanism to quantify uncertainty, which probability provides. Using probability, we can model elements of uncertainty such as risk in financial transactions and many other business processes. In contrast, in traditional programming we work with deterministic problems, i.e. problems whose solution is not affected by uncertainty.

LaMont, Colin H., Wiggins, Paul A.

There are three principal paradigms of statistical inference: (i) Bayesian, (ii) information-based and (iii) frequentist inference. We describe an objective prior (the weighting or $w$-prior) which unifies objective Bayes and information-based inference. The $w$-prior is chosen to make the marginal probability an unbiased estimator of the predictive performance of the model. This definition has several other natural interpretations. From the perspective of the information content of the prior, the $w$-prior is both uniformly and maximally uninformative. The $w$-prior can also be understood to result in a uniform density of distinguishable models in parameter space. Finally, we demonstrate that the $w$-prior is equivalent to the Akaike Information Criterion (AIC) for regular models in the asymptotic limit. The $w$-prior appears to be generically applicable to statistical inference and is free of {\it ad hoc} regularization. The mechanism for suppressing complexity is analogous to AIC: model complexity reduces model predictivity. We expect this new objective-Bayes approach to inference to be widely applicable to machine-learning problems, including singular models.
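For readers unfamiliar with AIC, the following is a minimal sketch of how the standard criterion, $\mathrm{AIC} = 2k - 2\ln\hat{L}$, penalizes complexity. It does not implement the $w$-prior. The coin-flip data and the two candidate models are hypothetical, chosen only to show the penalty at work.

```python
import math


def bernoulli_log_likelihood(p, heads, tails):
    """Log-likelihood of observing the given counts with heads probability p."""
    return heads * math.log(p) + tails * math.log(1.0 - p)


def aic(log_likelihood, num_params):
    """Standard Akaike Information Criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * num_params - 2 * log_likelihood


# Hypothetical data: 52 heads and 48 tails in 100 flips.
heads, tails = 52, 48

# Model A (assumed for illustration): a fair coin, with no free parameters.
fair_aic = aic(bernoulli_log_likelihood(0.5, heads, tails), num_params=0)

# Model B: one free parameter, set to its maximum-likelihood estimate.
p_hat = heads / (heads + tails)
free_aic = aic(bernoulli_log_likelihood(p_hat, heads, tails), num_params=1)

print(f"AIC, fair coin (k=0): {fair_aic:.3f}")
print(f"AIC, free p    (k=1): {free_aic:.3f}")
print("preferred model:", "fair coin" if fair_aic < free_aic else "free p")
```

With these counts the one-parameter model fits slightly better, but not by enough to offset its penalty, so the simpler model has the lower AIC. This is the sense in which added complexity can reduce predictivity.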