 Bayesian Learning


Regression, Logistic Regression and Maximum Entropy part 2 (code examples) – Ahmet Taspinar

#artificialintelligence

In the previous blog post we saw the theory and mathematics behind the Maximum Entropy and Logistic Regression classifiers. Logistic Regression is one of the most powerful classification methods in machine learning and can be used for a wide variety of tasks: pre-policing or predictive analytics in health, where it can aid tuberculosis patients or breast cancer diagnosis; modeling urban growth; analysing mortgage pre-payments and defaults; forecasting the direction and strength of stock market movement; and even sports. Reading all of this, the theory[1] of Maximum Entropy Classification might look difficult. In my experience, the average developer does not believe they can design a proper Maximum Entropy / Logistic Regression classifier from scratch.
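
To make the "from scratch" idea concrete, here is a minimal sketch of a binary logistic regression classifier trained with batch gradient descent. It is not the code from the linked post; the function names, the learning rate, the epoch count and the toy data are all assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def train_logistic_regression(X, y, lr=0.3, epochs=5000):
    """Fit weights and bias by batch gradient descent on the log-loss.

    lr and epochs are illustrative defaults, not tuned values.
    """
    n, n_features = len(X), len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # derivative of the log-loss w.r.t. the linear score
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(X, w, b):
    # Classify as 1 when the predicted probability is at least 0.5.
    return [1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
            for xi in X]

# Hypothetical one-feature toy data: class 1 for larger values.
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [0, 0, 0, 1, 1, 1]
w, b = train_logistic_regression(X, y)
predictions = predict(X, w, b)
```

On this separable toy set the learned weight is positive and the decision boundary settles between 2 and 3. Real data would also need feature scaling and some form of regularisation, both left out of this sketch.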


Bayesian Network-Based Extension for PGP — Estimating Petition Support

AAAI Conferences

Consider the problem of estimating the expected number of distinct eligible voters among the authors of a set of electronic signatures gathered for a petition (or citizen initiative) that has to pass legally required thresholds. We formalize this problem and propose an extension to the Pretty Good Privacy Web Of Trust, a mechanism for reciprocally certifying identities between peers. The extension (a) enables agents to certify additional relevant statements about others, and (b) gives agents opportunities for negative authentication statements (e.g., on ineligibility of an identity). A Bayesian Network model enables inferences on the data provided by the proposed PGP extension. Simulations and an agent-based platform are used to validate the concepts.


A Noisy-OR Model for Continuous Time Bayesian Networks

AAAI Conferences

A continuous time Bayesian network is a graphical model capable of describing discrete state systems that evolve in continuous time. Unfortunately, the number of parameters required for each node in the graph is exponential in the number of parents of the node, which can be prohibitively large for many real-world systems. To mitigate this problem, we propose a Noisy-OR model for continuous time Bayesian networks, which can reduce the number of required parameters from exponential to linear. We describe the model, as well as the process required to compute the remaining unspecified parameters. Finally, we experimentally validate the correctness of the proposed Noisy-OR formulation.
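
The exponential-to-linear reduction can be illustrated with the classical (static) Noisy-OR gate. Note that this is the textbook discrete version, not the continuous-time formulation proposed in the paper; the link probabilities and the `leak` parameter below are hypothetical values.

```python
from itertools import product

def noisy_or(link_probs, parent_states, leak=0.0):
    """P(child = 1 | parents) under a classical Noisy-OR gate.

    link_probs[i] is the probability that parent i alone turns the child on;
    leak is the probability that the child is on with no active parent.
    """
    p_off = 1.0 - leak
    for p, on in zip(link_probs, parent_states):
        if on:
            p_off *= 1.0 - p  # each active parent independently fails to fire
    return 1.0 - p_off

# Three parents: only 3 parameters are specified (hypothetical values) ...
link_probs = [0.8, 0.5, 0.3]

# ... yet they determine all 2**3 = 8 rows of the conditional table.
cpt = {states: noisy_or(link_probs, states)
       for states in product([0, 1], repeat=len(link_probs))}
```

With n parents, a full conditional table needs one entry per parent configuration (2^n rows), while the Noisy-OR gate needs only n link probabilities plus an optional leak term.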


Bayesian Networks with Conditional Truncated Densities

AAAI Conferences

The majority of Bayesian network learning and inference algorithms rely on the assumption that all random variables are discrete, which is not necessarily the case in real-world problems. In situations where some variables are continuous, a trade-off between the expressive power of the model and the computational complexity of inference has to be made: on one hand, conditional Gaussian models are computationally efficient but they lack expressive power; on the other hand, mixtures of exponentials (MTE), bases or polynomials are expressive but this comes at the expense of tractability. In this paper, we propose an alternative model that lies in between. It is composed of a "discrete" Bayesian network (BN) combined with a set of monodimensional conditional truncated densities modeling the uncertainty over the continuous random variables given their discrete counterpart resulting from a discretization process. We show that inference computation times in this new model are close to those in discrete BNs. Experiments confirm the tractability of the model and highlight its expressive power by comparing it with MTE.


Testing Independencies in Bayesian Networks with i-Separation

AAAI Conferences

Testing independencies in Bayesian networks (BNs) is a fundamental task in probabilistic reasoning. In this paper, we propose inaugural-separation (i-separation) as a new method for testing independencies in BNs. We establish the correctness of i-separation. Our method has several theoretical and practical advantages. There are at least five ways in which i-separation is simpler than d-separation, the classical method for testing independencies in BNs, of which the most important is that "blocking" works in an intuitive fashion. In practice, our empirical evaluation shows that i-separation tends to be faster than d-separation in large BNs.


Bayesian Network Inference with Simple Propagation

AAAI Conferences

We propose Simple Propagation (SP) as a new join tree propagation algorithm for exact inference in discrete Bayesian networks. We establish the correctness of SP. The striking feature of SP is that its message construction exploits the factorization of potentials at a sending node, but without the overhead of building and examining graphs as done in Lazy Propagation (LP). Experimental results on numerous benchmark Bayesian networks show that SP is often faster than LP.


A Dynamic Bayesian Network for Diagnosing Nuclear Power Plant Accidents

AAAI Conferences

When a severe nuclear power plant accident occurs, plant operators rely on Severe Accident Management Guidelines (SAMGs). However, current SAMGs are limited in scope and depth. The plant operators must work to mitigate the accident with limited experience and guidance for the situation. The SMART (Safely Managing Accidental Reactor Transients) procedures framework aims to fill the need for detailed guidance by creating a comprehensive probabilistic model, using a Dynamic Bayesian Network, to aid in the diagnosis of the reactor’s state. In this paper, we explore the viability of the proposed SMART procedures approach by building a prototype Bayesian network that allows for the diagnosis of two types of accidents based on a comprehensive data set. We use Kullback-Leibler (K-L) divergence to gauge the relative importance of each of the plant’s parameters. We compare accuracy and F-score measures across four different Bayesian networks: a baseline network that ignores observation variables, a network that ignores data from the observation variable with the highest K-L score, a network that ignores data from the variable with the lowest K-L score, and finally a network that includes all observation variable data. We conclude with an interpretation of these results for SMART procedures.
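
For readers unfamiliar with the K-L score mentioned above, the following is a minimal sketch of the Kullback-Leibler divergence between two discrete distributions. How the paper maps plant parameters to distributions is not described in this abstract, so the two example distributions are purely hypothetical.

```python
import math

def kl_divergence(p, q):
    """D_KL(P || Q) in nats for two discrete distributions over the same outcomes."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of outcomes")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue          # terms with p_i = 0 contribute nothing
        if qi == 0.0:
            return math.inf   # P puts mass where Q has none
        total += pi * math.log(pi / qi)
    return total

# Hypothetical distributions of one observation variable under two accident types.
p = [0.5, 0.5]
q = [0.9, 0.1]
score = kl_divergence(p, q)
```

The divergence is zero when the two distributions coincide and grows as they differ, which is why a larger score can be read as a more informative variable. It is not symmetric: `kl_divergence(p, q)` and `kl_divergence(q, p)` generally differ.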


A summary on Maximum likelihood Estimator

@machinelearnbot

A general method of building a predictive model starts with least squares estimation (LSE). We then need to work on the residuals, find the confidence intervals of the parameters, and test how well the model fits the data, all of which rest on the assumption that the residuals (or noise) are normally distributed. Unfortunately, that assumption is not guaranteed. Most of the time, the residual plot will look like some other distribution rather than the normal. At that point you could add one more factor term to your model to filter out the non-normally distributed noise, and then calculate the LSE again.


Sir Bayes: all but not naïve! - Quantdare

#artificialintelligence

Is it possible to classify and predict (yes, predict!) whether market trends will be bullish, bearish or ranged by using a method called "naïve" and based on something as simple as Bayes' theorem? Let's see! Our main objective is to explore machine learning techniques that can help us not only to label series in a posteriori analysis, but also to predict which class a new value of the series belongs to. The Naïve Bayesian Classifier is a supervised machine learning method as well as a statistical method for classification. Although its name includes a word as unusual as "naïve", it is the tool we have chosen to predict the different trends of a market represented by an index. Bayesian classification provides practical learning algorithms in which prior knowledge and observed data can be combined.
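
As a rough illustration of the idea, here is a small Gaussian Naïve Bayes sketch in plain Python. It is not the article's implementation: the two features (mean return and volatility over a window), the training values and the function names are assumptions made up for this example.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate a prior plus per-feature mean and variance for each class."""
    by_class = defaultdict(list)
    for xi, yi in zip(X, y):
        by_class[yi].append(xi)
    model = {}
    for label, rows in by_class.items():
        cols = list(zip(*rows))
        means = [sum(c) / len(c) for c in cols]
        # Small constant keeps the variance strictly positive.
        variances = [sum((v - m) ** 2 for v in c) / len(c) + 1e-9
                     for c, m in zip(cols, means)]
        model[label] = (len(rows) / len(X), means, variances)
    return model

def predict_nb(model, x):
    """Return the class with the highest log-posterior (up to a constant)."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, var in zip(x, means, variances):
            # "Naive" step: features are treated as independent given the class.
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical windows described by [mean return, volatility].
X = [[0.9, 0.3], [1.1, 0.4], [1.0, 0.5],
     [-1.0, 0.4], [-0.8, 0.5], [-1.2, 0.3],
     [0.0, 0.1], [0.1, 0.2], [-0.1, 0.1]]
y = ["bullish"] * 3 + ["bearish"] * 3 + ["ranged"] * 3

model = fit_gaussian_nb(X, y)
label = predict_nb(model, [0.95, 0.4])
```

The prior for each class is where prior knowledge enters, and the per-class Gaussians are where the observed data enter; the prediction combines the two through Bayes' theorem.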


Recognizing Snacks using SimpleCV

#artificialintelligence

This article aims to provide the basic knowledge of how to recognize snacks by using Python and SimpleCV. Readers will gain practical programming knowledge via experimentation with the Python scripts included in the Snack Classifier open source project. As an illustration of a snack recognition app, the Snack Watcher watches for any snacks present on the snack table. For Snack Watcher to determine if there was an interesting event, it needs to process the image into a set of image "Blobs". For each "Blob", Snack Watcher compares the "Blob" with its previous state to determine if the "Blob" was added, removed or stationary.
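
The added / removed / stationary comparison could look roughly like the sketch below. It does not use SimpleCV and is not taken from the Snack Classifier project: blobs are represented here simply as hypothetical (x, y) centroids, and the `tolerance` threshold is an assumed parameter.

```python
import math

def classify_blobs(previous, current, tolerance=10.0):
    """Label blobs by matching each current centroid to a previous one.

    A current blob is "stationary" if some unmatched previous blob lies
    within `tolerance` pixels (first match wins), otherwise "added".
    Previous blobs left unmatched are "removed".
    """
    unmatched_prev = list(previous)
    added, stationary = [], []
    for blob in current:
        match = None
        for candidate in unmatched_prev:
            if math.dist(blob, candidate) <= tolerance:
                match = candidate
                break
        if match is None:
            added.append(blob)
        else:
            stationary.append(blob)
            unmatched_prev.remove(match)
    return {"added": added, "removed": unmatched_prev, "stationary": stationary}

# Hypothetical centroids from two consecutive frames of the snack table.
previous_frame = [(10, 10), (100, 50)]
current_frame = [(12, 11), (200, 200)]
events = classify_blobs(previous_frame, current_frame)
```

Here the blob near (10, 10) is treated as stationary, the one at (100, 50) as removed, and the one at (200, 200) as added. A real system would likely compare blob area or colour as well as position; `math.dist` requires Python 3.8 or later.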