Neural Networks: Overviews


Takeaways from the Google Speech Summit 2018

@machinelearnbot

Generative Text-to-Speech Synthesis, Heiga Zen, Research Scientist Abstract: Recent progress in deep generative models and their application to text-to-speech (TTS) synthesis has produced a breakthrough in the naturalness of artificially generated speech.



A Guide to Machine Learning PhDs

#artificialintelligence

A machine learning PhD doesn't just open up some of the highest-paying jobs around; it sets you up to have an outsized positive impact on the world. This comprehensive guide on machine learning PhDs from 80,000 Hours (YC S15) will help you get started. The guide is based on discussions with six machine learning researchers, including two at DeepMind, one at OpenAI, and one running a robotics start-up. Check out the highlights below. Machine learning involves giving software rules to learn from experience rather than directly programming the steps it takes.
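The contrast between hand-coded rules and learning from experience can be illustrated with a minimal sketch (not taken from the guide itself); the toy data and the choice of scikit-learn's logistic regression below are purely illustrative assumptions.

```python
# Minimal sketch: learning a rule from examples instead of hand-coding it.
# The toy data and model choice are hypothetical, for illustration only.
from sklearn.linear_model import LogisticRegression

# Each example: [hours_studied, hours_slept]; label: 1 = passed exam, 0 = failed.
X = [[8, 7], [6, 8], [2, 4], [1, 6], [7, 5], [3, 3]]
y = [1, 1, 0, 0, 1, 0]

model = LogisticRegression()
model.fit(X, y)                     # the "rule" is learned from experience
print(model.predict([[5, 6]]))      # apply the learned rule to a new case
```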



Dense Adaptive Cascade Forest: A Densely Connected Deep Ensemble for Classification Problems

arXiv.org Machine Learning

Recent research has shown that a deep ensemble of forests can achieve a large increase in classification accuracy compared with general ensemble learning methods, especially when only few training data are available. In this paper, we take full advantage of this observation and introduce the Dense Adaptive Cascade Forest (daForest), which outperforms the original Cascade Forest. Notably, daForest can handle high-dimensional sparse data without any preprocessing of the raw data, such as PCA or other dimensionality-reduction methods. Our model is distinguished by three major features. First, it incorporates the SAMME.R boosting algorithm, which allows the model to keep improving as the number of layers increases, something that is not possible in a stacking model or a plain cascade forest. Second, the model connects each layer to all of its subsequent layers in a feed-forward fashion; this structure helps the model resist degeneration, the phenomenon observed when training stacking models in which accuracy rises slightly over the first few layers and then drops quickly as more layers are added. Third, we add a hyper-parameter optimization layer before the first classification layer, which searches for the optimal hyper-parameters and sets up the model in a short time, nearly halving the training time without much impact on final performance. Experimental results show that daForest performs particularly well on both high-dimensional low-order features and low-dimensional high-order features, in some cases even better than neural networks, and achieves state-of-the-art results.
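A minimal sketch of the densely connected cascade idea described above, assuming scikit-learn random forests as the per-layer learners; the layer count, feature concatenation, and lack of cross-validation are simplifications for illustration, not the paper's implementation.

```python
# Sketch of a densely connected cascade of forests (illustration only):
# each layer sees the raw features plus the class-probability outputs
# of *all* previous layers, mimicking the dense connections in daForest.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

n_layers = 3
augmented = X
layer_outputs = []
for depth in range(n_layers):
    forest = RandomForestClassifier(n_estimators=100, random_state=depth)
    forest.fit(augmented, y)
    probs = forest.predict_proba(augmented)   # class-probability vector per sample
    layer_outputs.append(probs)
    # Dense connection: concatenate raw features with every earlier layer's output.
    augmented = np.hstack([X] + layer_outputs)

print(augmented.shape)   # raw features plus probabilities from all layers
```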


A Guide to Sequence Prediction using Compact Prediction Tree (with codes in Python)

#artificialintelligence

Sequence prediction is one of the hottest applications of Deep Learning these days. From building recommendation systems to speech recognition and natural language processing, its potential is seemingly endless. This is enabling never-thought-before solutions to emerge in the industry and is driving innovation. There are many ways to perform sequence prediction, such as Markov models and directed graphs from the Machine Learning domain, and RNNs/LSTMs from the Deep Learning domain. In this article, we will see how we can perform sequence prediction using a relatively unknown algorithm called Compact Prediction Tree (CPT).
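As a rough illustration of the idea behind such lossless sequence models (not the full CPT algorithm, which also maintains a prediction tree, an inverted index, and a lookup table), the hypothetical sketch below predicts the next symbol by scoring the items that follow the longest matching suffix in the stored training sequences.

```python
# Hypothetical, simplified sketch of suffix-based next-item prediction.
# The real Compact Prediction Tree stores sequences compactly in a tree
# plus an inverted index; here we simply scan the stored sequences.
from collections import Counter

training_sequences = [
    ["home", "search", "product", "cart", "checkout"],
    ["home", "product", "cart", "checkout"],
    ["home", "search", "product", "home"],
]

def predict_next(prefix, sequences, max_suffix=3):
    scores = Counter()
    for k in range(min(max_suffix, len(prefix)), 0, -1):
        suffix = prefix[-k:]
        for seq in sequences:
            for i in range(len(seq) - k):
                if seq[i:i + k] == suffix:
                    scores[seq[i + k]] += k        # longer matches weigh more
        if scores:
            break                                  # use the longest suffix that matched
    return scores.most_common(1)[0][0] if scores else None

print(predict_next(["home", "search", "product"], training_sequences))
```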


Opening the black box of neural nets: case studies in stop/top discrimination

arXiv.org Machine Learning

We introduce techniques for exploring the functionality of a neural network and extracting simple, human-readable approximations to its performance. By performing gradient ascent on the input space of the network, we are able to produce large populations of artificial events which strongly excite a given classifier. By studying the populations of these events, we then directly produce what are essentially contour maps of the network's classification function. Combined with a suite of tools for identifying the input dimensions deemed most important by the network, we can utilize these maps to efficiently interpret the dominant criteria by which the network makes its classification. As a test case, we study networks trained to discriminate supersymmetric stop production in the dilepton channel from Standard Model backgrounds. In the case of a heavy stop decaying to a light neutralino, we find individual neurons with large mutual information with $m_{T2}^{\ell\ell}$, a human-designed variable for optimizing the analysis. The network selects events with significant missing $p_T$ oriented azimuthally away from both leptons, efficiently rejecting $t\overline{t}$ background. In the case of a light stop with three-body decays to $Wb{\widetilde \chi}$ and little phase space, we find neurons that smoothly interpolate between a similar top-rejection strategy and an ISR-tagging strategy allowing for more missing momentum. We also find that a neural network trained on a stealth stop parameter point learns novel angular correlations.
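A minimal sketch of the gradient-ascent-on-input-space idea, using PyTorch and a randomly initialized toy classifier as stand-ins; the network architecture, input dimension, and step size are hypothetical assumptions, not those of the paper.

```python
# Sketch: gradient ascent on the *inputs* of a fixed classifier to produce
# artificial events that strongly excite its output (illustration only).
import torch
import torch.nn as nn

torch.manual_seed(0)
classifier = nn.Sequential(            # stand-in for a trained network
    nn.Linear(10, 32), nn.ReLU(),
    nn.Linear(32, 1), nn.Sigmoid(),
)
for p in classifier.parameters():      # freeze the network's weights
    p.requires_grad_(False)

x = torch.randn(100, 10, requires_grad=True)   # a population of random "events"
optimizer = torch.optim.Adam([x], lr=0.05)

for step in range(200):
    optimizer.zero_grad()
    score = classifier(x).mean()       # how strongly the events excite the output
    (-score).backward()                # ascend the score by descending its negative
    optimizer.step()

print(float(classifier(x).mean()))     # the population now sits in high-score regions
```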


Artificial Intelligence, Deep Learning, and Neural Networks explained

#artificialintelligence

Artificial intelligence (AI), deep learning, and neural networks represent incredibly exciting and powerful machine learning-based techniques used to solve many real-world problems. For a primer on machine learning, you may want to read this five-part series that I wrote. While human-like deductive reasoning, inference, and decision-making by a computer are still a long way off, there have been remarkable gains in the application of AI techniques and associated algorithms. The concepts discussed here are extremely technical and complex, drawing on mathematics, statistics, probability theory, physics, signal processing, machine learning, computer science, psychology, linguistics, and neuroscience. That said, this article is not meant to provide such a technical treatment, but rather to explain these concepts at a level that can be understood by most non-practitioners, while also serving as a reference or review for technical folks.


A Guide to TensorFlow (Part 1)

#artificialintelligence

TensorFlow is an open source software library for numerical computation using data flow graphs. It is an extremely popular symbolic math library and is widely used for machine learning applications such as neural networks. This blog is a part of "A Guide To TensorFlow", where we will explore the TensorFlow API and use it to build multiple machine learning models for real-life examples. In this blog we shall uncover the TensorFlow graph, understand the concept of tensors, and explore TensorFlow data types. At the heart of a TensorFlow program is the computation graph described in code.
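A minimal sketch of the computation-graph idea, assuming the TensorFlow 1.x graph-and-session API that was current when this guide was written (not code from the guide itself); node names are illustrative.

```python
# Minimal sketch of a TensorFlow computation graph (assumes the 1.x
# graph-and-session API; later TF versions execute eagerly by default).
import tensorflow as tf

graph = tf.Graph()
with graph.as_default():
    a = tf.constant(3.0, dtype=tf.float32, name="a")   # node producing a tensor
    b = tf.constant(4.0, dtype=tf.float32, name="b")
    c = tf.add(a, b, name="sum")                        # node consuming two tensors

# Nothing has been computed yet; the graph only describes the computation.
with tf.Session(graph=graph) as sess:
    print(sess.run(c))   # 7.0 -- running the graph evaluates the "sum" node
```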


16 Free Machine Learning Books

#artificialintelligence

The following is a list of free books on Machine Learning. A Brief Introduction To Neural Networks provides a comprehensive overview of the subject of neural networks and is divided into four parts: Part I: From Biology to Formalization (Motivation, Philosophy, History and Realization of Neural Models), Part II: Supervised Learning Network Paradigms, Part III: Unsupervised Learning Network Paradigms, and Part IV: Excursi, Appendices and Registers. A Course In Machine Learning is designed to provide a gentle and pedagogically organized introduction to the field and to present a view of machine learning that focuses on ideas and models, not on math. The audience of this book is anyone who knows differential calculus and discrete math, and can program reasonably well. An undergraduate in their fourth or fifth semester should be fully capable of understanding this material.