 Deep Learning


Vector Institute on Twitter

#artificialintelligence

The Vector Institute is dedicated to the transformative field of artificial intelligence, excelling in machine and deep learning research.


We're Already in DEEP War with AI

#artificialintelligence

Artificial Intelligence hasn't demonstrated clear consciousness or sentience as far as we know. But with the infiltration of the deep mind knowledge base, an artificial intelligence able to write its own code and learn, not unlike biological creatures, has risen online. Our interactions with the various tentacles of this deep intelligence indicate that what Paul told us in Ephesians chapter 6 holds true: our war is not against flesh and blood, but with a hierarchy of non-corporeal intelligent entities. An unseen entity that exists in cyberspace can very much be a demonic entity making decisions through the AI. And while it all sounds too fantastic, it ought to be alarming that a recent MIT article admits that everyone from Nvidia to DARPA has no clue how the AI is conducting its reasoning.


Scaling machine learning

#artificialintelligence

Reza Zadeh is giving a talk, "Scaling computer vision in the cloud," at the O'Reilly Artificial Intelligence Conference, June 26-29, 2017, in New York City. In this episode of the Data Show, I spoke with Reza Zadeh, adjunct professor at Stanford University, co-organizer of ScaledML, and co-founder of Matroid, a startup focused on commercial applications of deep learning and computer vision. Zadeh is also the co-author of the forthcoming book TensorFlow for Deep Learning (now in early release).


Impact of deep learning on computer vision

#artificialintelligence

A rather high-profile area generating headlines this year has been connected vehicles. The technological challenges that must be addressed before autonomous cars can be unleashed onto the streets are quite significant. Vision is one critical factor; your car needs to be able to identify all road hazards and navigate from A to B. So, how can a car achieve that in an often overcrowded highway space? Computer vision can be described as graphics in reverse. Rather than us viewing the computer's world, the computer turns around to look at ours.


AI Nanodegree Program Syllabus: Term 2 (Deep Learning), In Depth

#artificialintelligence

Here at Udacity, we are tremendously excited to announce the kick-off of the second term of our Artificial Intelligence Nanodegree program. We are excited because we are able to provide a depth of education that is commensurate with university education; because we are bridging the gap between universities and industry by providing you with hands-on projects and partnering with the top industries in the field; and last but certainly not least, because we are able to bring this education to many more people across the globe, at a cost that makes a top-notch AI education realistic for all aspiring learners. During the first term, you've enjoyed learning about Game Playing Agents, Simulated Annealing, Constraint Satisfaction, Logic and Planning, and Probabilistic AI from some of the biggest names in the field: Sebastian Thrun, Peter Norvig, and Thad Starner. Term 2 will be focused on one of the cutting-edge advancements of AI: Deep Learning. In this term, you will learn about the foundations of neural networks, understand how to train these neural networks with techniques such as gradient descent and backpropagation, and learn about the different types of architectures that make neural networks work for a variety of different applications.


Bandit Structured Prediction for Neural Sequence-to-Sequence Learning

arXiv.org Machine Learning

Bandit structured prediction describes a stochastic optimization framework where learning is performed from partial feedback. This feedback is received in the form of a task loss evaluation to a predicted output structure, without having access to gold standard structures. We advance this framework by lifting linear bandit learning to neural sequence-to-sequence learning problems using attention-based recurrent neural networks. Furthermore, we show how to incorporate control variates into our learning algorithms for variance reduction and improved generalization. We present an evaluation on a neural machine translation task that shows improvements of up to 5.89 BLEU points for domain adaptation from simulated bandit feedback.
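
As a rough illustration of learning from partial feedback with a control variate, the sketch below trains a toy softmax policy over three candidate outputs using only the loss of the sampled output. It is a stand-in under stated assumptions, not the paper's method: the policy is a plain categorical distribution rather than an attention-based sequence-to-sequence network, the running-average baseline is just one simple form of control variate, and `train_bandit`, `simulated_losses`, and all hyperparameters are hypothetical.

```python
import math
import random

def softmax(theta):
    # Numerically stable softmax over a list of scores.
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    z = sum(exps)
    return [e / z for e in exps]

def train_bandit(losses, steps=2000, lr=0.5, seed=0):
    """Minimize expected loss when only the sampled output's loss is observed."""
    rng = random.Random(seed)
    theta = [0.0] * len(losses)
    baseline = 0.0  # running average of observed losses (assumed control variate)
    for _ in range(steps):
        probs = softmax(theta)
        # Sample one output; the learner never sees the losses of the others.
        y = rng.choices(range(len(losses)), weights=probs)[0]
        loss = losses[y]
        # Subtracting the baseline lowers variance without biasing the gradient.
        advantage = loss - baseline
        for k in range(len(theta)):
            grad_log_p = (1.0 if k == y else 0.0) - probs[k]
            theta[k] -= lr * advantage * grad_log_p
        baseline += 0.1 * (loss - baseline)
    return softmax(theta)

# Hypothetical simulated feedback: output 1 has the lowest task loss.
simulated_losses = [0.9, 0.1, 0.5]
final_probs = train_bandit(simulated_losses)
```

In this toy setting the policy should concentrate its probability mass on the lowest-loss output, even though no gold-standard output is ever revealed.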


Scatteract: Automated extraction of data from scatter plots

arXiv.org Machine Learning

Charts are an excellent way to convey patterns and trends in data, but they do not facilitate further modeling of the data or close inspection of individual data points. We present a fully automated system for extracting the numerical values of data points from images of scatter plots. We use deep learning techniques to identify the key components of the chart, and optical character recognition together with robust regression to map from pixels to the coordinate system of the chart. We focus on scatter plots with linear scales, which already have several interesting challenges. Previous work has done fully automatic extraction for other types of charts, but to our knowledge this is the first approach that is fully automatic for scatter plots. Our method performs well, achieving successful data extraction on 89% of the plots in our test set.
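
The pixel-to-coordinate step can be sketched as a robust line fit through tick marks whose values were read by OCR. The example below uses a Theil-Sen estimator (median of pairwise slopes) purely for illustration; the abstract does not say which robust regression the authors use, and the tick positions and values are invented.

```python
from itertools import combinations
from statistics import median

def theil_sen(pixels, values):
    """Robustly fit value = slope * pixel + intercept (Theil-Sen estimator)."""
    slopes = [
        (v2 - v1) / (p2 - p1)
        for (p1, v1), (p2, v2) in combinations(zip(pixels, values), 2)
        if p2 != p1
    ]
    if not slopes:
        raise ValueError("need at least two ticks at distinct pixel positions")
    slope = median(slopes)
    intercept = median(v - slope * p for p, v in zip(pixels, values))
    return slope, intercept

def pixel_to_coord(pixel, slope, intercept):
    # Map a pixel position on a linear axis to chart coordinates.
    return slope * pixel + intercept

# Hypothetical x-axis ticks: pixel column and the value OCR returned for each.
tick_pixels = [100, 200, 300, 400, 500]
tick_values = [0.0, 10.0, 20.0, 80.0, 40.0]  # 80.0 simulates an OCR misread of 30

slope, intercept = theil_sen(tick_pixels, tick_values)
x = pixel_to_coord(350, slope, intercept)
```

Because the fit relies on medians, the single misread tick does not pull the line away from the four consistent ones, which is the property that makes a robust regression attractive after a noisy OCR stage.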


Estimating Nonlinear Dynamics with the ConvNet Smoother

arXiv.org Machine Learning

Estimating the state of a dynamical system from a series of noise-corrupted observations is fundamental in many areas of science and engineering. The most well-known method, the Kalman smoother (and the related Kalman filter), relies on assumptions of linearity and Gaussianity that are rarely met in practice. In this paper, we introduce a new dynamical smoothing method that exploits the remarkable capabilities of convolutional neural networks to approximate complex non-linear functions. The main idea is to generate a training set composed of both latent states and observations from an ensemble of simulators and to train the deep network to recover the former from the latter. Importantly, this method only requires the availability of the simulators and can therefore be applied in situations in which either the latent dynamical model or the observation model cannot be easily expressed in closed form. In our simulation studies, we show that the resulting ConvNet smoother has almost optimal performance in the Gaussian case even when the parameters are unknown. Furthermore, the method can be successfully applied to extremely non-linear and non-Gaussian systems. Finally, we empirically validate our approach via the analysis of measured brain signals.


Entropy-SGD: Biasing Gradient Descent Into Wide Valleys

arXiv.org Machine Learning

This paper proposes a new optimization algorithm called Entropy-SGD for training deep neural networks that is motivated by the local geometry of the energy landscape. Local extrema with low generalization error have a large proportion of almost-zero eigenvalues in the Hessian with very few positive or negative eigenvalues. We leverage this observation to construct a local-entropy-based objective function that favors well-generalizable solutions lying in large flat regions of the energy landscape, while avoiding poorly-generalizable solutions located in the sharp valleys. Conceptually, our algorithm resembles two nested loops of SGD where we use Langevin dynamics in the inner loop to compute the gradient of the local entropy before each update of the weights. We show that the new objective has a smoother energy landscape and show improved generalization over SGD using uniform stability, under certain assumptions. Our experiments on convolutional and recurrent networks demonstrate that Entropy-SGD compares favorably to state-of-the-art techniques in terms of generalization error and training time.
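
The two nested loops can be sketched on a one-dimensional toy problem. This is a minimal reading of the description above, not the authors' implementation: the inner loop runs noisy (Langevin-style) gradient steps on the loss plus a quadratic coupling to the current weight and keeps a running average `mu` of the iterates, and the outer loop moves the weight toward that average. The toy loss, the function name `entropy_sgd`, and every hyperparameter value are assumptions chosen for illustration.

```python
import math
import random

def grad_f(x):
    # Gradient of the toy loss f(x) = 0.5 * (x - 3)^2 (hypothetical example).
    return x - 3.0

def entropy_sgd(x0, outer_steps=50, inner_steps=20, gamma=1.0,
                eta=0.5, eta_inner=0.1, noise=1e-3, alpha=0.75, seed=0):
    rng = random.Random(seed)
    x = x0
    for _ in range(outer_steps):
        x_prime = x  # inner iterate starts at the current weight
        mu = x       # running average of the inner iterates
        for _ in range(inner_steps):
            # Gradient of f(x') + (gamma / 2) * (x - x')^2 with respect to x'.
            d = grad_f(x_prime) - gamma * (x - x_prime)
            # Langevin-style step: gradient descent plus Gaussian noise.
            x_prime += -eta_inner * d + math.sqrt(eta_inner) * noise * rng.gauss(0.0, 1.0)
            mu = (1.0 - alpha) * mu + alpha * x_prime
        # Outer step along the estimated local-entropy gradient gamma * (x - mu).
        x -= eta * gamma * (x - mu)
    return x

x_final = entropy_sgd(x0=0.0)
```

On this convex toy loss the procedure simply approaches the minimizer near 3; the preference for wide valleys described in the abstract only becomes visible on non-convex landscapes.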


Discrete Variational Autoencoders

arXiv.org Machine Learning

Probabilistic models with discrete latent variables naturally capture datasets composed of discrete classes. However, they are difficult to train efficiently, since backpropagation through discrete variables is generally not possible. We present a novel method to train a class of probabilistic models with discrete latent variables using the variational autoencoder framework, including backpropagation through the discrete latent variables. The associated class of probabilistic models comprises an undirected discrete component and a directed hierarchical continuous component. The discrete component captures the distribution over the disconnected smooth manifolds induced by the continuous component. As a result, this class of models efficiently learns both the class of objects in an image, and their specific realization in pixels, from unsupervised data, and outperforms state-of-the-art methods on the permutation-invariant MNIST, Omniglot, and Caltech-101 Silhouettes datasets.