Deep Learning
Machine Learning Engineer
At HyperScience we bring AI to the office. Our products help enterprises and government institutions function by automating certain kinds of office work and reducing the bureaucratic burden on both businesses and their customers. We take a heterogeneous approach to AI, using a blend of what are traditionally considered different fields of ML: deep learning, computer vision, and NLP, among others. We believe that AI is destined to be the biggest event in the history of human labor since the Industrial Revolution, and we want to be a part of it. ML is at the core of what we do.
Max-Pooling Loss Training of Long Short-Term Memory Networks for Small-Footprint Keyword Spotting
Sun, Ming, Raju, Anirudh, Tucker, George, Panchapagesan, Sankaran, Fu, Gengshen, Mandal, Arindam, Matsoukas, Spyros, Strom, Nikko, Vitaladevuni, Shiv
We propose a max-pooling based loss function for training Long Short-Term Memory (LSTM) networks for small-footprint keyword spotting (KWS), with low CPU, memory, and latency requirements. The max-pooling loss training can be further guided by initializing with a cross-entropy loss trained network. A posterior smoothing based evaluation approach is employed to measure keyword spotting performance. Our experimental results show that LSTM models trained using cross-entropy loss or max-pooling loss outperform a cross-entropy loss trained baseline feed-forward Deep Neural Network (DNN). In addition, a max-pooling loss trained LSTM with a randomly initialized network performs better than a cross-entropy loss trained LSTM. Finally, the max-pooling loss trained LSTM initialized with a cross-entropy pre-trained network shows the best performance, yielding a $67.6\%$ relative reduction in the Area Under the Curve (AUC) measure compared to the baseline feed-forward DNN.
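The idea of a max-pooling loss can be illustrated with a minimal sketch. It assumes one common reading of the technique: background frames receive ordinary per-frame cross-entropy, while inside the keyword segment only the frame with the highest keyword posterior contributes. The function names, the two-class setup, and the `keyword_span` argument are hypothetical and are not taken from the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def max_pooling_loss(frame_logits, keyword_class, keyword_span, background_class=0):
    """Sketch of a max-pooling loss for one utterance (assumed formulation).

    frame_logits: list of per-frame logit lists, one list per time step.
    keyword_span: (start, end) frame indices of the keyword, end exclusive,
                  or None when the utterance contains no keyword.
    """
    posteriors = [softmax(f) for f in frame_logits]
    start, end = keyword_span if keyword_span else (0, 0)
    loss = 0.0
    # Background frames: cross-entropy against the background class on every frame.
    for t, p in enumerate(posteriors):
        if not (start <= t < end):
            loss -= math.log(p[background_class])
    # Keyword frames: only the frame with the highest keyword posterior is penalised.
    if end > start:
        best = max(range(start, end), key=lambda t: posteriors[t][keyword_class])
        loss -= math.log(posteriors[best][keyword_class])
    return loss
```

In this sketch the network is only asked to fire confidently on one frame of the keyword, rather than on every frame of the segment as frame-level cross-entropy would require.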
Safety Verification of Deep Neural Networks
Huang, Xiaowei, Kwiatkowska, Marta, Wang, Sen, Wu, Min
Deep neural networks have achieved impressive experimental results in image classification, but can surprisingly be unstable with respect to adversarial perturbations, that is, minimal changes to the input image that cause the network to misclassify it. With potential applications including perception modules and end-to-end controllers for self-driving cars, this raises concerns about their safety. We develop a novel automated verification framework for feed-forward multi-layer neural networks based on Satisfiability Modulo Theory (SMT). We focus on safety of image classification decisions with respect to image manipulations, such as scratches or changes to camera angle or lighting conditions that would result in the same class being assigned by a human, and define safety for an individual decision in terms of invariance of the classification within a small neighbourhood of the original image. We enable exhaustive search of the region by employing discretisation, and propagate the analysis layer by layer. Our method works directly with the network code and, in contrast to existing methods, can guarantee that adversarial examples, if they exist, are found for the given region and family of manipulations. If found, adversarial examples can be shown to human testers and/or used to fine-tune the network. We implement the techniques using Z3 and evaluate them on state-of-the-art networks, including regularised and deep learning networks. We also compare against existing techniques to search for adversarial examples and estimate network robustness.
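The notion of safety as invariance of the classification within a discretised neighbourhood can be illustrated with a toy sketch. This is not the SMT-based, layer-by-layer method of the paper and does not use Z3; it simply enumerates a grid of points in a small box around an input and checks a hypothetical linear classifier on each one. All names and parameters (`classify`, `find_adversarial`, `radius`, `steps`) are assumptions made for illustration.

```python
import itertools

def classify(x, weights, biases):
    """Toy linear classifier: index of the highest-scoring class."""
    scores = [sum(w * v for w, v in zip(row, x)) + b
              for row, b in zip(weights, biases)]
    return max(range(len(scores)), key=lambda i: scores[i])

def find_adversarial(x, weights, biases, radius, steps):
    """Exhaustively search a discretised box around x for a class change.

    Returns the first grid point classified differently from x, or None
    if every grid point keeps the original class.
    """
    original = classify(x, weights, biases)
    # Offsets evenly spaced in [-radius, radius], 2 * steps + 1 per dimension.
    offsets = [radius * k / steps for k in range(-steps, steps + 1)]
    for delta in itertools.product(offsets, repeat=len(x)):
        candidate = [v + d for v, d in zip(x, delta)]
        if classify(candidate, weights, biases) != original:
            return candidate
    return None
```

A `None` result here only says that no grid point changes the class; any guarantee is relative to the chosen discretisation, and the number of grid points grows exponentially with the input dimension, which is why a brute-force search like this does not scale to real images.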
50 Deep Learning Software Tools and Platforms, Updated
Blocks, a Theano framework for training neural networks.
Caffe, a deep learning framework made with expression, speed, and modularity in mind. It can model arbitrary layer connectivity and network depth. Any directed acyclic graph of layers will do. Training is done using the back-propagation algorithm.
ConvNet, a Matlab-based convolutional neural network toolbox. A type of deep learning, it can learn useful features from raw data by itself.
An AI can recognize musical genres better than humans
Researchers tested the AI by having a pianist play a variety of music -- baroque, classical, ragtime and jazz -- in a live demonstration. The AI then assessed the likely genre in real time, vastly outperforming conventional software hand-coded by humans. "I think the deep learning system performs better because it's had a dispassionate look at quite a lot of audio material," says Monty Barlow, director of Machine Learning at Cambridge Consultants. "It's found the best way to detect one genre from another without any prejudice or bias. It's strangely more human-like in its capabilities than our programmers were in the classical engineering approach."
#CanadaAI Summit
Canada is a world leader in business intelligence services and offers a wealth of opportunities for companies looking for AI solutions to grow their footprint in Canada. The AI Summit will be an opportunity to learn more about the economic outlook, industry trends, and Canada's deep learning talent pool / capabilities in AI. The event is open to all tech companies, entrepreneurs, VCs, and academia with an interest in AI.
Which deep learning network is best for you?
In "Big data - a road map for smarter data," I describe a set of machine learning architectures that provide advanced capabilities, including image, handwriting, video, and speech recognition, natural language processing, and object recognition. There is no perfect deep learning network that will solve all your business problems. Hopefully, the table below, with its accompanying descriptive outline, will give you insight into the best fit-for-purpose framework for your business problem. The ranking is based on the number of stars awarded by developers on GitHub. The numbers were compiled at the beginning of May 2017.
Deep Learning Based Regression and Multi-class Models for Acute Oral Toxicity Prediction with Automatic Chemical Feature Extraction
Xu, Youjun, Pei, Jianfeng, Lai, Luhua
For quantitative structure-property relationship (QSPR) studies in chemoinformatics, it is important to obtain an interpretable relationship between chemical properties and chemical features. However, the predictive power and interpretability of QSPR models are usually two different objectives that are difficult to achieve simultaneously. A deep learning architecture using molecular graph encoding convolutional neural networks (MGE-CNN) provides a universal strategy for constructing interpretable QSPR models with high predictive power. Instead of using application-specific preset molecular descriptors or fingerprints, the models can be resolved using raw and pertinent features without manual intervention or selection. In this study, we developed acute oral toxicity (AOT) models of compounds using the MGE-CNN architecture as a case study. Three types of high-level predictive models were constructed for AOT evaluation: a regression model (deepAOT-R), a multi-classification model (deepAOT-C), and a multi-task model (deepAOT-CR). These models substantially outperformed previously reported models. For the two external datasets containing 1673 (test set I) and 375 (test set II) compounds, the $R^2$ and mean absolute error (MAE) of deepAOT-R on test set I were 0.864 and 0.195, and the prediction accuracy of deepAOT-C was 95.5% and 96.3% on test sets I and II, respectively. The two external prediction accuracies of deepAOT-CR are 95.0% and 94.1%, while its $R^2$ and MAE on test set I are 0.861 and 0.204, respectively.
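For readers unfamiliar with the two regression metrics quoted above, the following is a minimal sketch of how they are commonly computed. It assumes that R2 denotes the coefficient of determination, which the abstract does not state (some QSPR studies report the squared Pearson correlation instead). The function names are hypothetical.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean) ** 2 for t in y_true)               # total sum of squares
    return 1.0 - ss_res / ss_tot
```

Both functions take plain lists of measured and predicted values of equal length; `r_squared` is undefined when all measured values are identical, since the total sum of squares is then zero.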
Machine Learning on Sequential Data Using a Recurrent Weighted Average
Ostmeyer, Jared, Cowell, Lindsay
Recurrent Neural Networks (RNN) are a type of statistical model designed to handle sequential data. The model reads a sequence one symbol at a time. Each symbol is processed based on information collected from the previous symbols. With existing RNN architectures, each symbol is processed using only information from the previous processing step. To overcome this limitation, we propose a new kind of RNN model that computes a recurrent weighted average (RWA) over every past processing step. Because the RWA can be computed as a running average, the computational overhead scales like that of any other RNN architecture. The approach essentially reformulates the attention mechanism into a stand-alone model. The performance of the RWA model is assessed on the variable copy problem, the adding problem, classification of artificial grammar, classification of sequences by length, and classification of the MNIST images (where the pixels are read sequentially one at a time). On almost every task, the RWA model is found to outperform a standard LSTM model.
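The claim that a weighted average over every past step can be maintained as a running average can be checked with a short sketch. It is a simplification: in the actual model the values and attention scores would be computed from the current input and the previous hidden state, and the result would pass through a nonlinearity, all of which is omitted here. The function names and the fixed `values` and `scores` lists are assumptions for illustration.

```python
import math

def weighted_average_full(values, scores):
    """Softmax-weighted average computed over all steps at once."""
    m = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - m) for s in scores]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

def weighted_average_running(values, scores):
    """Same average after each step, keeping only two accumulators."""
    numerator, denominator = 0.0, 0.0
    history = []
    for v, s in zip(values, scores):
        w = math.exp(s)          # unnormalised attention weight for this step
        numerator += w * v       # running sum of weighted values
        denominator += w         # running sum of weights
        history.append(numerator / denominator)
    return history
```

The running version stores only a numerator and a denominator, so the per-step cost does not grow with sequence length. A practical implementation would also need to guard against overflow in `math.exp` for large scores, which this sketch does not do.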
Inverse Reinforcement Learning via Deep Gaussian Process
Jin, Ming, Damianou, Andreas, Abbeel, Pieter, Spanos, Costas
We propose a new approach to inverse reinforcement learning (IRL) based on the deep Gaussian process (deep GP) model, which is capable of learning complicated reward structures with few demonstrations. Our model stacks multiple latent GP layers to learn abstract representations of the state feature space, which is linked to the demonstrations through the Maximum Entropy learning framework. Incorporating the IRL engine into the nonlinear latent structure renders existing deep GP inference approaches intractable. To tackle this, we develop a non-standard variational approximation framework which extends previous inference schemes. This allows for approximate Bayesian treatment of the feature space and guards against overfitting. Carrying out representation and inverse reinforcement learning simultaneously within our model outperforms state-of-the-art approaches, as we demonstrate with experiments on standard benchmarks ("object world", "highway driving") and a new benchmark ("binary world").