 Deep Learning


Stochastic Training of Neural Networks via Successive Convex Approximations

arXiv.org Machine Learning

This paper proposes a new family of algorithms for training neural networks (NNs). These are based on recent developments in the field of non-convex optimization, going under the general name of successive convex approximation (SCA) techniques. The basic idea is to iteratively replace the original (non-convex, high-dimensional) learning problem with a sequence of (strongly convex) approximations, which are both accurate and simple to optimize. Unlike similar ideas (e.g., quasi-Newton algorithms), the approximations can be constructed using only first-order information of the neural network function, in a stochastic fashion, while exploiting the overall structure of the learning problem for faster convergence. We discuss several use cases, based on different choices for the loss function (e.g., squared loss and cross-entropy loss), and for the regularization of the NN's weights. We experiment on several medium-sized benchmark problems, and on a large-scale dataset involving simulated physical data. The results show how the algorithm outperforms state-of-the-art techniques, providing faster convergence to a better minimum. Additionally, we show how the algorithm can be easily parallelized over multiple computational units without hindering its performance. In particular, each computational unit can optimize a tailored surrogate function defined on a randomly assigned subset of the input variables, whose dimension can be selected depending entirely on the available computational power.
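The SCA idea described in this abstract can be illustrated with a deliberately tiny sketch. It is not the paper's algorithm: the one-parameter model `tanh(w * x)`, the proximal weight `tau`, the L2 weight `lam`, the step-size schedule, and all function names are assumptions chosen for illustration. At each iteration the model is linearized around the current weight (first-order information only), the squared loss is kept intact so the surrogate is strongly convex and has a closed-form minimizer, and the weight moves part of the way toward that minimizer.

```python
import math
import random


def model(w, x):
    """Toy one-parameter 'network' (an assumption for illustration)."""
    return math.tanh(w * x)


def model_grad(w, x):
    """Derivative of the model output with respect to w."""
    return x * (1.0 - math.tanh(w * x) ** 2)


def mean_squared_loss(w, data):
    return sum((y - model(w, x)) ** 2 for x, y in data) / len(data)


def surrogate_minimizer(w, batch, tau, lam):
    """Minimize the strongly convex surrogate built around w.

    Surrogate in d = w_new - w:
        mean((r_i - J_i * d)^2) + (tau / 2) * d^2 + lam * (w + d)^2
    where r_i is the residual and J_i the model derivative at w.
    """
    n = len(batch)
    jr = sum(model_grad(w, x) * (y - model(w, x)) for x, y in batch) / n
    jj = sum(model_grad(w, x) ** 2 for x, _ in batch) / n
    d = (2.0 * jr - 2.0 * lam * w) / (2.0 * jj + tau + 2.0 * lam)
    return w + d


def train_sca(data, w0=0.0, iters=200, batch_size=8,
              tau=0.5, lam=0.0, gamma0=0.5, seed=0):
    rng = random.Random(seed)
    w = w0
    for t in range(iters):
        batch = rng.sample(data, batch_size)      # stochastic mini-batch
        w_hat = surrogate_minimizer(w, batch, tau, lam)
        gamma = gamma0 / (1.0 + 0.01 * t)         # diminishing step size
        w = w + gamma * (w_hat - w)               # convex combination step
    return w


# Synthetic data generated by the same model with w = 1.5.
data = [(x / 10.0, math.tanh(1.5 * x / 10.0)) for x in range(-20, 21)]
w_final = train_sca(data)
print(w_final, mean_squared_loss(w_final, data))
```

Because the surrogate keeps the convex loss and only linearizes the model, each step uses more of the problem's structure than a plain gradient step, while still needing only first derivatives.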


Deep adversarial neural decoding

arXiv.org Machine Learning

Here, we present a novel approach to solve the problem of reconstructing perceived stimuli from brain responses by combining probabilistic inference with deep learning. Our approach first inverts the linear transformation from latent features to brain responses with maximum a posteriori estimation and then inverts the nonlinear transformation from perceived stimuli to latent features with adversarial training of convolutional neural networks. We test our approach with a functional magnetic resonance imaging experiment and show that it can generate state-of-the-art reconstructions of perceived faces from brain activations.
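The first inversion step can be made concrete with a worked equation, under assumptions that the abstract does not state: a linear-Gaussian response model with weights $B$ and noise covariance $\Sigma$, and a zero-mean Gaussian prior with covariance $R$ on the latent features $z$. These symbols are illustrative, not taken from the paper.

```latex
% Assumed model: brain response y is a noisy linear function of latent features z
y = B^{\top} z + \varepsilon, \qquad
\varepsilon \sim \mathcal{N}(0, \Sigma), \qquad
z \sim \mathcal{N}(0, R)

% MAP estimate: maximize the log-posterior, which is quadratic in z
\hat{z}_{\mathrm{MAP}}
  = \arg\max_{z} \Big[ -\tfrac{1}{2} (y - B^{\top} z)^{\top} \Sigma^{-1} (y - B^{\top} z)
                       - \tfrac{1}{2} z^{\top} R^{-1} z \Big]

% Setting the gradient to zero gives a linear system with a closed-form solution
\hat{z}_{\mathrm{MAP}}
  = \big( B \Sigma^{-1} B^{\top} + R^{-1} \big)^{-1} B \Sigma^{-1} y
```

Under these assumptions the linear stage has a closed-form inverse, which leaves only the nonlinear map from latent features back to stimuli to be learned by the adversarially trained network.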


Deep Clustering and Conventional Networks for Music Separation: Stronger Together

arXiv.org Machine Learning

Deep clustering is the first method to handle general audio separation scenarios with multiple sources of the same type and an arbitrary number of sources, performing impressively in speaker-independent speech separation tasks. However, little is known about its effectiveness in other challenging situations such as music source separation. Contrary to conventional networks that directly estimate the source signals, deep clustering generates an embedding for each time-frequency bin, and separates sources by clustering the bins in the embedding space. We show that deep clustering outperforms conventional networks on a singing voice separation task, in both matched and mismatched conditions, even though conventional networks have the advantage of end-to-end training for best signal approximation, presumably because its more flexible objective engenders better regularization. Since the strengths of deep clustering and conventional network architectures appear complementary, we explore combining them in a single hybrid network trained via an approach akin to multi-task learning. Remarkably, the combination significantly outperforms either of its components.
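The separation step described above, clustering time-frequency bins in the embedding space, can be sketched as follows. This is a minimal illustration, assuming the embeddings have already been produced by a trained network; the hand-written 2-D embeddings, the plain k-means with farthest-point initialization, and the function names are all assumptions, not the authors' implementation.

```python
def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def kmeans_labels(points, k, iters=20):
    """Plain k-means with deterministic farthest-point initialization."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(list(farthest))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def masks_from_embeddings(embeddings, n_frames, n_bins, k):
    """Turn per-bin cluster labels into one binary mask per source.

    `embeddings` is a flat list, one vector per time-frequency bin,
    ordered frame by frame.
    """
    labels = kmeans_labels(embeddings, k)
    return [
        [[1 if labels[t * n_bins + f] == j else 0 for f in range(n_bins)]
         for t in range(n_frames)]
        for j in range(k)
    ]


# Toy example: 2 frames x 3 frequency bins, 2 sources (hypothetical values).
embeddings = [
    [1.0, 0.0], [0.9, 0.1], [0.0, 1.0],   # frame 0
    [0.1, 0.9], [1.0, 0.1], [0.0, 0.9],   # frame 1
]
masks = masks_from_embeddings(embeddings, n_frames=2, n_bins=3, k=2)
print(masks)
```

Each mask would then be multiplied element-wise with the mixture spectrogram to recover one source. A conventional network, by contrast, would output the masks or source spectra directly.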


A Simple and Accurate Syntax-Agnostic Neural Model for Dependency-based Semantic Role Labeling

arXiv.org Artificial Intelligence

We introduce a simple and accurate neural model for dependency-based semantic role labeling. Our model predicts predicate-argument dependencies relying on states of a bidirectional LSTM encoder. The semantic role labeler achieves competitive performance on English, even without any kind of syntactic information and only using local inference. However, when automatically predicted part-of-speech tags are provided as input, it substantially outperforms all previous local models and approaches the best reported results on the English CoNLL-2009 dataset. We also consider Chinese, Czech and Spanish where our approach also achieves competitive results. Syntactic parsers are unreliable on out-of-domain data, so standard (i.e., syntactically-informed) SRL models are hindered when tested in this setting. Our syntax-agnostic model appears more robust, resulting in the best reported results on standard out-of-domain test sets.



Getting Started with Deep Learning

@machinelearnbot

This article was written by Matthew Rubashkin. With a background in optical physics and biomedical research, Matthew has a broad range of experiences in software development, database engineering, and data analytics. At SVDS, our R&D team has been investigating different deep learning technologies, from recognizing images of trains to speech recognition. We needed to build a pipeline for ingesting data, creating a model, and evaluating the model performance. However, when we researched what technologies were available, we could not find a concise summary document to reference for starting a new deep learning project.


This robot uses deep learning to write and play its own music

#artificialintelligence

Artificial intelligence has proved itself incredibly capable of analysing images, and now it's getting rhythm in the form of a four-armed, marimba-playing robot. The robot, named Shimon, was given a vast amount of musical data by researchers at Georgia Institute of Technology: more than 5,000 complete songs and two million motifs, riffs and short passages of music. It was then asked to compose and perform its own music. It's been in development for some years, but this is the first time it has composed its own music. Once it had been fed the data, it was able to use deep learning techniques to create two 30-second pieces of original music.


Drilling Down into Machine Learning and Deep Learning

#artificialintelligence

Deep learning, in turn, is a subclass of machine learning that creates machines using methods originally inspired by how a cat's brain reacted to light signals and then generalized to mimic the human brain's ability to learn. Until recently, we simply didn't have enough data and processing power to train a machine to learn. Deep neural networks (DNNs) learn at many levels of abstraction, ranging from simple concepts to complex ones. This is what the "deep" in deep learning refers to. Each layer in the neural network categorizes some kind of information, refines it, and passes it along to the next layer.
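The layer-by-layer flow described here can be sketched as a small forward pass. This is only an illustration: the weights are hand-picked rather than learned, and the function names and the choice of ReLU activations are assumptions.

```python
def relu(values):
    return [max(0.0, v) for v in values]


def dense(weights, biases, inputs):
    """One fully connected layer: weighted sums of the inputs plus a bias."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def forward(layers, inputs):
    """Pass the input through each layer in turn, keeping every intermediate result."""
    activations = [inputs]
    for weights, biases in layers:
        activations.append(relu(dense(weights, biases, activations[-1])))
    return activations


# Two stacked layers with hypothetical, hand-picked weights.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),  # layer 1: 2 inputs -> 2 units
    ([[1.0, 1.0]], [-1.0]),                   # layer 2: 2 inputs -> 1 unit
]
activations = forward(layers, [2.0, 1.0])
print(activations)  # [[2.0, 1.0], [1.0, 1.5], [1.5]]
```

Each entry in `activations` is the representation handed to the next layer; in a trained network the weights would be fitted to data rather than written by hand.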


ElementAI raises historic $137.5 million Series A round

#artificialintelligence

Element AI, the Montreal-based artificial intelligence (AI) powerhouse, today announced it has raised $137.5 million ($102 million USD) in Series A funding, the largest in history for an AI company. The round was led by Data Collective (DCVC) with further investments from Real Ventures, Development Bank of Canada (BDC), Fidelity Investments Canada, Hanwha Investment, Intel Capital, Microsoft Ventures, National Bank of Canada, NVIDIA, Tencent, and several of the world's largest sovereign wealth funds. The funding will allow Element AI to invest in large-scale AI projects internationally, solidifying its position as the largest global AI company in Canada and creating 250 jobs in the Canadian high tech sector by January 2018. Co-founded by serial entrepreneurs Jean-François Gagné and Nicolas Chapados, Real Ventures and Yoshua Bengio, a co-father of deep learning technology, Element AI aims to bring academic AI innovation to global organizations. Started in October 2016 to empower industry with the massive scale of academic AI innovation Bengio was driving at the world-leading Montreal Institute of Learning Algorithms (MILA), the two groups pioneered a unique, non-exploitative model of academic cooperation they have since replicated at many other institutes.


Forget AlphaGo, DeepMind has a more interesting step toward general AI

#artificialintelligence

AlphaGo and self-driving cars are amazingly clever, but neither represents a very big leap toward general artificial intelligence. Fortunately, some AI researchers are developing ways of broadening machine intelligence. The researchers at DeepMind, which created the champion Go-playing robot AlphaGo, are working on an approach that could prove significant in the quest to make machines as intelligent as we are. In two papers published this week and reported by New Scientist, researchers at the Alphabet subsidiary describe efforts to teach computers about relational reasoning, a cognitive capability that is foundational to human intelligence. Simply put, relational reasoning is the ability to consider relationships between different mental representations, such as objects, words, or ideas.