Bayesian Learning
Artificial Intelligence: Structures and Strategies for Complex Problem Solving
Many and long were the conversations between Lord Byron and Shelley to which I was a devout and silent listener. During one of these, various philosophical doctrines were discussed, and among others the nature of the principle of life, and whether there was any probability of its ever being discovered and communicated. They talked of the experiments of Dr. Darwin (I speak not of what the doctor really did or said that he did, but, as more to my purpose, of what was then spoken of as having been done by him), who preserved a piece of vermicelli in a glass case till by some extraordinary means it began to move with a voluntary motion. Not thus, after all, would life be given. Perhaps a corpse would be reanimated; galvanism had given token of such things: perhaps the component parts of a creature might be manufactured, brought together, and endued with vital warmth (Butler 1998).
David Poole - Probabilistic Research
This page contains some information on research by David Poole and students on probabilistic reasoning and decision making. It is not intended to be an introduction to the vast literature on these topics, but only the incremental work done by me. For other perspectives, see the pointers from the Uncertainty in AI (UAI) home page. Maybe someday I will write an online introduction. Probabilistic Horn abduction is a pragmatic combination of logic and probability.
Abduction (Stanford Encyclopedia of Philosophy)
You happen to know that Tim and Harry have recently had a terrible row that ended their friendship. Now someone tells you that she just saw Tim and Harry jogging together. The best explanation for this that you can think of is that they made up. You conclude that they are friends again. One morning you enter the kitchen to find a plate and cup on the table, with breadcrumbs and a pat of butter on it, and surrounded by a jar of jam, a pack of sugar, and an empty carton of milk. You conclude that one of your house-mates got up at night to make him- or herself a midnight snack and was too tired to clear the table. This, you think, best explains the scene you are facing. To be sure, it might be that someone burgled the house and took the time to have a bite while on the job, or a house-mate might have arranged the things on the table without having a midnight snack but just to make you believe that someone had a midnight snack. But these hypotheses strike you as providing much more contrived explanations of the data than the one you infer to. Walking along the beach, you see what looks like a picture of Winston Churchill in the sand. It could be that, as in the opening pages of Hilary Putnam's (1981), what you see is actually the trace of an ant crawling on the beach. The much simpler, and therefore (you think) much better, explanation is that someone intentionally drew a picture of Churchill in the sand. That, in any case, is what you come away believing. In these examples, the conclusions do not follow logically from the premises.
A symbolic algebra for the computation of expected utilities in multiplicative influence diagrams
Leonelli, Manuele, Riccomagno, Eva, Smith, Jim Q.
Influence diagrams provide a compact graphical representation of decision problems. Several algorithms for the quick computation of their associated expected utilities are available in the literature. However, often they rely on a full quantification of both probabilistic uncertainties and utility values. For problems where all random variables and decision spaces are finite and discrete, here we develop a symbolic way to calculate the expected utilities of influence diagrams that does not require a full numerical representation. Within this approach expected utilities correspond to families of polynomials. After characterizing their polynomial structure, we develop an efficient symbolic algorithm for the propagation of expected utilities through the diagram and provide an implementation of this algorithm using a computer algebra system. We then characterize many of the standard manipulations of influence diagrams as transformations of polynomials. We also generalize the decision analytic framework of these diagrams by defining asymmetries as operations over the expected utility polynomials.
Polymorphic Malware Detection Using Sequence Classification Methods
Polymorphic malware detection is challenging due to the continual mutations miscreants introduce to successive instances of a particular virus. Such changes are akin to mutations in biological sequences. Recently, high-throughput methods for gene sequence classification have been developed by the bioinformatics and computational biology communities. In this paper, we argue that these methods can be usefully applied to malware detection. Unfortunately, gene classification tools are usually optimized for and restricted to an alphabet of four letters (nucleic acids). Consequently, we have selected the Strand gene sequence classifier, which offers a robust classification strategy that can easily accommodate unstructured data with any alphabet including source code or compiled machine code. To demonstrate Strand's suitability for classifying malware, we execute it on approximately 500GB of malware data provided by the Kaggle Microsoft Malware Classification Challenge (BIG 2015) used for predicting 9 classes of polymorphic malware.
Interactive Elicitation of Knowledge on Feature Relevance Improves Predictions in Small Data Sets
Micallef, Luana, Sundin, Iiris, Marttinen, Pekka, Ammad-ud-din, Muhammad, Peltola, Tomi, Soare, Marta, Jacucci, Giulio, Kaski, Samuel
Providing accurate predictions is challenging for machine learning algorithms when the number of features is larger than the number of samples in the data. Prior knowledge can improve machine learning models by indicating relevant variables and parameter values. Yet, this prior knowledge is often tacit and only available from domain experts. We present a novel approach that uses interactive visualization to elicit the tacit prior knowledge and uses it to improve the accuracy of prediction models. The main component of our approach is a user model that models the domain expert's knowledge of the relevance of different features for a prediction task. In particular, based on the expert's earlier input, the user model guides the selection of the features on which to elicit the user's knowledge next. The results of a controlled user study show that the user model significantly improves prior knowledge elicitation and prediction accuracy when predicting the relative citation counts of scientific documents in a specific domain.
Beginners Exercise: Bayesian Computation with Stan and Farmer Jöns
Over the last two years I've occasionally been giving a very basic tutorial on Bayesian statistics using R and Stan. At the end of the tutorial I hand out an exercise for those who want to flex their newly acquired skills. I call this exercise Bayesian computation with Stan and Farmer Jöns and it's pretty cool! Now, it's not cool because of me, but because the expressiveness of Stan allowed me to write a small number of data analytic questions that quickly take you from running a simple binomial model up to running a linear regression. Throughout the exercise you work with the same model code and each question just requires you to make a minimal change to this code, yet you will cover most models taught in a basic statistics course!
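To give a feel for what the "simple binomial model" at the start of such an exercise computes, here is a minimal sketch in plain Python. It is not Stan code and not taken from the exercise: it swaps Stan's sampler for a grid approximation, assumes a uniform prior on the success probability, and uses made-up data (6 successes in 10 trials). The function name binomial_posterior_grid is hypothetical.

```python
from math import comb


def binomial_posterior_grid(successes, trials, grid_size=1001):
    """Grid approximation of the posterior over a binomial success rate.

    Assumes a uniform prior on [0, 1]; this is an illustrative choice,
    not something specified by the exercise.
    """
    # Evenly spaced candidate values for the success probability.
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    # Binomial likelihood of the data at each candidate value.
    likelihood = [
        comb(trials, successes) * p**successes * (1 - p) ** (trials - successes)
        for p in grid
    ]
    # With a uniform prior the posterior is the normalized likelihood.
    total = sum(likelihood)
    posterior = [value / total for value in likelihood]
    return grid, posterior


# Hypothetical data: 6 successes out of 10 trials.
grid, posterior = binomial_posterior_grid(successes=6, trials=10)
posterior_mean = sum(p * w for p, w in zip(grid, posterior))
print(f"posterior mean of the success rate: {posterior_mean:.3f}")
```

Under a uniform prior the exact posterior is Beta(7, 5), whose mean is 7/12, so the grid result can be checked against that value. In Stan the same model would be written declaratively and sampled with MCMC rather than evaluated on a grid.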
The Perceptron
Most tasks in Machine Learning can be reduced to classification tasks. For example, we have a medical dataset and we want to classify who has diabetes (positive class) and who doesn't (negative class). We have a dataset from the financial world and want to know which customers will default on their credit (positive class) and which customers will not (negative class). To do this, we can train a Classifier with a 'training dataset'. Once the Classifier is trained (we have determined its model parameters) and can accurately classify the training set, we can use it to classify new data (test set). If the training is done properly, the Classifier should predict the classes of the new data with a similar accuracy.
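The train-then-classify workflow described above can be sketched with the classic perceptron learning rule. This is a minimal illustration, not the article's own code: the function names train_perceptron and predict, the learning rate, the epoch limit, and the toy data (the logical AND function, which is linearly separable) are all assumptions made for the example.

```python
def predict(weights, bias, x):
    """Return 1 (positive class) if the weighted sum exceeds 0, else 0."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, labels, epochs=20, learning_rate=1.0):
    """Fit weights and bias with the perceptron update rule."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in zip(samples, labels):
            # update is +lr, -lr, or 0 depending on the prediction error.
            update = learning_rate * (y - predict(weights, bias, x))
            if update != 0:
                weights = [w + update * xi for w, xi in zip(weights, x)]
                bias += update
                errors += 1
        # Stop early once every training sample is classified correctly.
        if errors == 0:
            break
    return weights, bias


# Toy training set (hypothetical): logical AND of two binary features.
train_x = [(0, 0), (0, 1), (1, 0), (1, 1)]
train_y = [0, 0, 0, 1]

weights, bias = train_perceptron(train_x, train_y)
predictions = [predict(weights, bias, x) for x in train_x]
print("predictions on the training set:", predictions)
```

Because the toy data is linearly separable, the perceptron rule is guaranteed to reach a set of parameters that classifies the training set without error. On real data one would hold out a separate test set to check that accuracy carries over to unseen samples.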