Directed Networks
Naïve-Bayes Technique for Machine Learning Blog - BRIDGEi2i Analytics Solutions
"We are to admit no more causes of natural things than such as are both true and sufficient to explain their appearances." "When you have two competing theories that make exactly the same predictions, the simpler one is the better." One famous example of Occam's Razor in action is found in conspiracy theories surrounding the NASA moon landings. Many conspiracy theorists believe that the first Moon Landing was staged and filmed in a studio, part of an elaborate hoax. Their justification relies upon many twisted and convoluted theories, whereas the NASA argument is fairly straightforward.
Predicting with confidence: the best machine learning idea you never heard of
One of the disadvantages of machine learning as a discipline is the lack of reasonable confidence intervals on a given prediction. There are all kinds of reasons you might want such a thing, but I think machine learning and data science practitioners are so drunk with newfound powers, they forget where such a thing might be useful. If you're really confident, for example, that someone will click on an ad, you probably want to serve one that pays a nice click through rate. If you have some kind of gambling engine, you want to bet more money on the predictions you are more confident of. Or if you're diagnosing an illness in a patient, it would be awfully nice to be able to tell the patient how certain you are of the diagnosis and what the confidence in the prognosis is. There are various ad hoc ways that people do this sort of thing.
Artificial Intelligence Neural Networks
Yet another research area in AI, neural networks, is inspired by the natural neural network of the human nervous system. What are Artificial Neural Networks (ANNs)? The inventor of the first neurocomputer, Dr. Robert Hecht-Nielsen, defines a neural network as ... The idea of ANNs is based on the belief that the working of the human brain can be imitated, by making the right connections, using silicon and wires in place of living neurons and dendrites. The human brain is composed of about 100 billion nerve cells called neurons. They are connected to thousands of other cells by axons. Stimuli from the external environment or inputs from sensory organs are accepted by dendrites. These inputs create electric impulses, which quickly travel through the neural network.
The best kept secret about linear and logistic regression
All the regression theory developed by statisticians over the last 200 years (related to the general linear model) is useless. Regression can be performed as accurately without statistical models, including the computation of confidence intervals (for estimates, predicted values or regression parameters). The non-statistical approach is also more robust than theory described in all statistics textbooks and taught in all statistical courses. It does not require Map-Reduce when data is really big, nor any matrix inversion, maximum likelihood estimation, or mathematical optimization (Newton algorithm). It is indeed incredibly simple, robust, easy to interpret, and easy to code (no statistical libraries required).
Distributed Gaussian Learning over Time-varying Directed Graphs
Nedić, Angelia, Olshevsky, Alex, Uribe, César A.
The analysis of distributed (non-Bayesian) learning algorithms has gained popularity since the seminal work of Jadbabaie et al. [1]. The ability of non-Bayesian updates to combine distributed optimization and learning algorithms makes them especially useful for the design of distributed estimation algorithms with provable performance. In the distributed learning setup, a group of agents repeatedly receives signals about a certain unknown state of the world or parameter. No single agent has enough information to accurately estimate the unknown state and, thus, interaction with other agents is needed. Several results are readily available for performance evaluation of distributed learning algorithms for a variety of scenarios.
Lessons from Bayesian disease diagnosis: Don't over-interpret the Bayes factor, VERSION 2
[This revision has corrected derivations, new R/JAGS code, and new diagrams.] Overview: "Captain, the prior probability of this character dying and leaving the show is infinitesimal." A primary example of Bayes' rule is for disease diagnosis (or illicit drug screening). The example is invoked routinely to explain the importance of prior probabilities. Here's one version of it: Suppose a diagnostic test has a 97% detection rate and a 5% false alarm rate.
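As a rough illustration of why the prior matters in this kind of example, the sketch below applies Bayes' rule to the 97% detection rate and 5% false alarm rate quoted above. The snippet is cut off before it states a prior, so the 1% base rate used here (`ASSUMED_PRIOR`) is purely an assumption for illustration, and the function name is hypothetical rather than taken from the post's R/JAGS code.

```python
def posterior_given_positive(prior, hit_rate, false_alarm_rate):
    """Return P(disease | positive test) using Bayes' rule."""
    # Total probability of a positive test: true positives plus false alarms.
    evidence = hit_rate * prior + false_alarm_rate * (1.0 - prior)
    return hit_rate * prior / evidence


HIT_RATE = 0.97          # detection rate, from the text
FALSE_ALARM_RATE = 0.05  # false alarm rate, from the text
ASSUMED_PRIOR = 0.01     # ASSUMPTION: base rate is not given in the snippet

posterior = posterior_given_positive(ASSUMED_PRIOR, HIT_RATE, FALSE_ALARM_RATE)

# Bayes factor for a positive result: how much the test shifts the odds,
# independent of the prior.
bayes_factor = HIT_RATE / FALSE_ALARM_RATE

print(f"Bayes factor for a positive test: {bayes_factor:.1f}")
print(f"Posterior probability of disease: {posterior:.3f}")
```

Under the assumed 1% prior, the Bayes factor is large while the posterior probability of disease stays well below one half, which is the kind of gap the title warns about. A different prior would give a different posterior with the same Bayes factor.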
A Nonparametric Latent Factor Model For Location-Aware Video Recommendations
We are interested in learning customers' video preferences from their historic viewing patterns and geographical location. We consider a Bayesian latent factor modeling approach for this task. In order to tune the complexity of the model to best represent the data, we make use of Bayesian nonparametric techniques. We describe an inference technique that can scale to large real-world data sets. Finally, we show results obtained by applying the model to a large internal Netflix data set, which illustrate that the model was able to capture interesting relationships between viewing patterns and geographical location.
Structured Inference Networks for Nonlinear State Space Models
Krishnan, Rahul G., Shalit, Uri, Sontag, David
Gaussian state space models have been used for decades as generative models of sequential data. They admit an intuitive probabilistic interpretation, have a simple functional form, and enjoy widespread adoption. We introduce a unified algorithm to efficiently learn a broad class of linear and non-linear state space models, including variants where the emission and transition distributions are modeled by deep neural networks. Our learning algorithm simultaneously learns a compiled inference network and the generative model, leveraging a structured variational approximation parameterized by recurrent neural networks to mimic the posterior distribution. We apply the learning algorithm to both synthetic and real-world datasets, demonstrating its scalability and versatility. We find that using the structured approximation to the posterior results in models with significantly higher held-out likelihood.
Need for DYNAMICAL Machine Learning: Bayesian exact recursive estimation
In my recent blog, Marrying Kalman Filtering & Machine Learning, we saw the merger of Bayesian exact recursive estimation (algorithm for which is Kalman Filter/Smoother in the linear, Gaussian case) and Machine Learning. We developed a solution called Kernel Projection Kalman Filter for business applications that require static or dynamical, dynamical or time-varying dynamical, linear or non-linear Machine Learning, i.e., pretty much all applications - therefore, Kernel Projection Kalman Filter is a "universal" solution . . . Indeed, university courses in ML largely teach static ML. Given a set of inputs and outputs, find a static map between the two during supervised "Training" and use this static map for business purposes during "Operation" (which is called "Testing" during pre-operation evaluation). In real life, static is hardly the case ... Before we proceed further, it will be useful to review my blog, "Prediction – the other dismal science?",
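To make "exact recursive estimation in the linear, Gaussian case" concrete, here is a minimal sketch of a scalar Kalman filter. It assumes a random-walk state observed with Gaussian noise; the function name, parameters, and default values are hypothetical, and this is not the Kernel Projection Kalman Filter the post refers to.

```python
def kalman_filter_1d(observations, process_var, obs_var, mean0=0.0, var0=1.0):
    """Scalar Kalman filter for an assumed random-walk state model.

    State:       x_t = x_{t-1} + w_t,  w_t ~ N(0, process_var)
    Observation: y_t = x_t + v_t,      v_t ~ N(0, obs_var)
    Returns a list of (posterior mean, posterior variance) per step.
    """
    mean, var = mean0, var0
    estimates = []
    for y in observations:
        # Predict: the random walk keeps the mean and inflates the variance.
        var = var + process_var
        # Update: blend prediction and observation using the Kalman gain.
        gain = var / (var + obs_var)
        mean = mean + gain * (y - mean)
        var = (1.0 - gain) * var
        estimates.append((mean, var))
    return estimates


# Usage: the estimate is revised recursively as each observation arrives,
# rather than fitting one static map to the whole data set.
for step, (m, v) in enumerate(kalman_filter_1d([1.0, 1.2, 0.9, 1.1], 0.01, 0.5)):
    print(f"step {step}: mean={m:.3f}, variance={v:.3f}")
```

Each step needs only the previous mean and variance, which is what makes the estimate recursive and suitable for time-varying data.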
The Mathematics of Machine Learning
In the last few months, I have had several people contact me about their enthusiasm for venturing into the world of data science and using Machine Learning (ML) techniques to probe statistical regularities and build impeccable data-driven products. However, I've observed that some actually lack the necessary mathematical intuition and framework to get useful results. This is the main reason I decided to write this blog post. Recently, there has been an upsurge in the availability of many easy-to-use machine and deep learning packages such as scikit-learn, Weka, and TensorFlow. Machine Learning theory is a field that draws on statistics, probability, computer science, and algorithms, and deals with learning iteratively from data and finding hidden insights that can be used to build intelligent applications. Despite the immense possibilities of Machine and Deep Learning, a thorough mathematical understanding of many of these techniques is necessary for a good grasp of the inner workings of the algorithms and for getting good results.