 Learning Graphical Models


Artificial Intelligence Neural Networks

#artificialintelligence

Yet another research area in AI, neural networks, is inspired by the natural neural network of the human nervous system. What are Artificial Neural Networks (ANNs)? The inventor of the first neurocomputer, Dr. Robert Hecht-Nielsen, defines a neural network as ... The idea of ANNs is based on the belief that the working of the human brain can be imitated, by making the right connections, using silicon and wires in place of living neurons and dendrites. The human brain is composed of about 100 billion nerve cells called neurons. Each is connected to thousands of other cells by axons. Stimuli from the external environment or inputs from sensory organs are accepted by dendrites. These inputs create electric impulses, which quickly travel through the neural network.


The best kept secret about linear and logistic regression

@machinelearnbot

All the regression theory developed by statisticians over the last 200 years (related to the general linear model) is useless. Regression can be performed as accurately without statistical models, including the computation of confidence intervals (for estimates, predicted values or regression parameters). The non-statistical approach is also more robust than theory described in all statistics textbooks and taught in all statistical courses. It does not require Map-Reduce when data is really big, nor any matrix inversion, maximum likelihood estimation, or mathematical optimization (Newton algorithm). It is indeed incredibly simple, robust, easy to interpret, and easy to code (no statistical libraries required).


Deep Learning: Recurrent Neural Networks in Python

@machinelearnbot

Like the course I just released on Hidden Markov Models, Recurrent Neural Networks are all about learning sequences - but whereas Markov Models are limited by the Markov assumption, Recurrent Neural Networks are not - and as a result, they are more expressive, and more powerful than anything we've seen on tasks that we haven't made progress on in decades. So what's going to be in this course and how will it build on the previous neural network courses and Hidden Markov Models? In the first section of the course we are going to add the concept of time to our neural networks. I'll introduce you to the Simple Recurrent Unit, also known as the Elman unit. We are going to revisit the XOR problem, but we're going to extend it so that it becomes the parity problem - you'll see that regular feedforward neural networks will have trouble solving this problem but recurrent networks will work because the key is to treat the input as a sequence.
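To illustrate why treating the input as a sequence helps with parity, here is a minimal sketch. It is hand-wired rather than trained and is not the course's code; the names `xor` and `parity_recurrent` are assumptions chosen for illustration. The hidden state carries the running parity, so every time step only has to solve a 2-input XOR, however long the sequence is.

```python
def xor(a, b):
    """XOR of two bits, the 2-input problem that each recurrent step solves."""
    return (a + b) % 2


def parity_recurrent(bits):
    """Return the parity of a bit sequence by carrying one bit of state."""
    h = 0  # hidden state: parity of the bits seen so far
    for x in bits:
        h = xor(h, x)  # the same update is reused at every time step
    return h


if __name__ == "__main__":
    print(parity_recurrent([1, 0, 1, 1]))  # three ones, so the parity is odd
```

A feedforward network sees all bits at once and must represent the full parity function directly, whereas the recurrent form reuses one small update across time.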


Segmental Convolutional Neural Networks for Detection of Cardiac Abnormality With Noisy Heart Sound Recordings

arXiv.org Machine Learning

Heart diseases constitute a global health burden, and the problem is exacerbated by the error-prone nature of listening to and interpreting heart sounds. This motivates the development of automated classification to screen for abnormal heart sounds. Existing machine learning-based systems achieve accurate classification of heart sound recordings but rely on expert features that have not been thoroughly evaluated on noisy recordings. Here we propose a segmental convolutional neural network architecture that achieves automatic feature learning from noisy heart sound recordings. Our experiments show that our best model, trained on noisy recording segments acquired with an existing hidden semi-Markov model-based approach, attains a classification accuracy of 87.5% on the 2016 PhysioNet/CinC Challenge dataset, compared to the 84.6% accuracy of the state-of-the-art statistical classifier trained and evaluated on the same dataset. Our results indicate the potential of using neural network-based methods to increase the accuracy of automated classification of heart sound recordings for improved screening of heart diseases.


Distributed Gaussian Learning over Time-varying Directed Graphs

arXiv.org Machine Learning

The analysis of distributed (non-Bayesian) learning algorithms has gained popularity since the seminal work of Jadbabaie et al. [1]. The ability of non-Bayesian updates to combine distributed optimization and learning algorithms makes them especially useful for the design of distributed estimation algorithms with provable performance. In the distributed learning setup, a group of agents repeatedly receives signals about a certain unknown state of the world or parameter. No single agent has enough information to accurately estimate the unknown state and, thus, interaction with other agents is needed. Several results are readily available for performance evaluation of distributed learning algorithms in a variety of scenarios.


Lessons from Bayesian disease diagnosis: Don't over-interpret the Bayes factor, VERSION 2

#artificialintelligence

[This revision has corrected derivations, new R/JAGS code, and new diagrams.] Overview: "Captain, the prior probability of this character dying and leaving the show is infinitesimal." A primary example of Bayes' rule is for disease diagnosis (or illicit drug screening). The example is invoked routinely to explain the importance of prior probabilities. Here's one version of it: Suppose a diagnostic test has a 97% detection rate and a 5% false alarm rate.
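The role of the prior in this example can be sketched in a few lines of Python. The 97% detection rate and 5% false alarm rate come from the snippet; the prior is not given here, so the 1% value below is a hypothetical chosen for illustration, and the function name `posterior_given_positive` is likewise an assumption.

```python
def posterior_given_positive(prior, hit_rate=0.97, false_alarm_rate=0.05):
    """P(disease | positive test) by Bayes' rule."""
    true_pos = hit_rate * prior                    # P(positive and disease)
    false_pos = false_alarm_rate * (1.0 - prior)   # P(positive and no disease)
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # The 1% prior is a hypothetical value, not taken from the post.
    print(round(posterior_given_positive(prior=0.01), 3))
```

Under that assumed 1% prior, a positive result raises the probability of disease to only about 16%, which is why the example is used to stress prior probabilities.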


A Nonparametric Latent Factor Model For Location-Aware Video Recommendations

arXiv.org Machine Learning

We are interested in learning customers' video preferences from their historic viewing patterns and geographical location. We consider a Bayesian latent factor modeling approach for this task. In order to tune the complexity of the model to best represent the data, we make use of Bayesian nonparametric techniques. We describe an inference technique that can scale to large real-world data sets. Finally, we show results obtained by applying the model to a large internal Netflix data set, which illustrate that the model was able to capture interesting relationships between viewing patterns and geographical location.


Structured Inference Networks for Nonlinear State Space Models

arXiv.org Machine Learning

Gaussian state space models have been used for decades as generative models of sequential data. They admit an intuitive probabilistic interpretation, have a simple functional form, and enjoy widespread adoption. We introduce a unified algorithm to efficiently learn a broad class of linear and non-linear state space models, including variants where the emission and transition distributions are modeled by deep neural networks. Our learning algorithm simultaneously learns a compiled inference network and the generative model, leveraging a structured variational approximation parameterized by recurrent neural networks to mimic the posterior distribution. We apply the learning algorithm to both synthetic and real-world datasets, demonstrating its scalability and versatility. We find that using the structured approximation to the posterior results in models with significantly higher held-out likelihood.


Need for DYNAMICAL Machine Learning: Bayesian exact recursive estimation

@machinelearnbot

In my recent blog, Marrying Kalman Filtering & Machine Learning, we saw the merger of Bayesian exact recursive estimation (algorithm for which is Kalman Filter/Smoother in the linear, Gaussian case) and Machine Learning. We developed a solution called Kernel Projection Kalman Filter for business applications that require static or dynamical, dynamical or time-varying dynamical, linear or non-linear Machine Learning, i.e., pretty much all applications - therefore, Kernel Projection Kalman Filter is a "universal" solution . . . Indeed, university courses in ML largely teach static ML. Given a set of inputs and outputs, find a static map between the two during supervised "Training" and use this static map for business purposes during "Operation" (which is called "Testing" during pre-operation evaluation). In real life, static is hardly the case ... Before we proceed further, it will be useful to review my blog, "Prediction – the other dismal science?",
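For readers unfamiliar with recursive estimation in the linear, Gaussian case, the following is a minimal scalar Kalman filter sketch. It is not the author's Kernel Projection Kalman Filter: it assumes a random-walk state model, and the function name `kalman_1d` and the default noise variances are illustrative assumptions.

```python
def kalman_1d(measurements, q=1e-3, r=0.1, x0=0.0, p0=1.0):
    """Filter noisy scalar measurements under a random-walk state model.

    q: process noise variance, r: measurement noise variance,
    x0, p0: initial state estimate and its variance (all assumed values).
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + q              # predict: state unchanged, uncertainty grows
        k = p / (p + r)        # Kalman gain
        x = x + k * (z - x)    # update with the measurement residual
        p = (1.0 - k) * p      # posterior variance after the update
        estimates.append(x)
    return estimates


if __name__ == "__main__":
    print(kalman_1d([1.0, 1.2, 0.9, 1.1])[-1])
```

The estimate is revised after every new measurement rather than being fixed once during training, which is the contrast with a static map that the post draws.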


Is deep learning a Markov chain in disguise?

@machinelearnbot

Andrej Karpathy's post "The Unreasonable Effectiveness of Recurrent Neural Networks" made splashes last year. The basic premise is that you can create a recurrent neural network to learn language features character-by-character. But is the resultant model any different from a Markov chain built for the same purpose? I implemented a character-by-character Markov chain in R to find out. First, let's play a variation of the Imitation Game with generated text from Karpathy's tinyshakespeare dataset.
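A character-by-character Markov chain of the kind described can be sketched as follows. The post's implementation is in R; this Python version is an independent illustration, and the function names, the order of 3, and the toy training string are all assumptions.

```python
import random
from collections import defaultdict


def build_chain(text, order=3):
    """Map each `order`-character context to the characters that followed it."""
    chain = defaultdict(list)
    for i in range(len(text) - order):
        chain[text[i:i + order]].append(text[i + order])
    return chain


def generate(chain, seed, length=100, rng=None):
    """Extend `seed` one character at a time by sampling observed successors.

    `seed` must have the same length as the order used to build the chain.
    """
    rng = rng or random.Random(0)
    order = len(seed)
    out = seed
    for _ in range(length):
        followers = chain.get(out[-order:])
        if not followers:  # context never seen in the training text
            break
        out += rng.choice(followers)
    return out


if __name__ == "__main__":
    sample = "to be or not to be that is the question "
    chain = build_chain(sample * 3, order=3)
    print(generate(chain, seed="to ", length=40))
```

Storing repeated followers in a list makes sampling proportional to observed frequency without computing explicit transition probabilities.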