 Deep Learning


Semi-Supervised Generation with Cluster-aware Generative Models

arXiv.org Machine Learning

Deep generative models trained with large amounts of unlabelled data have proven to be powerful within the domain of unsupervised learning. Many real-life data sets contain a small number of labelled data points that are typically disregarded when training generative models. We propose the Cluster-aware Generative Model, which uses unlabelled information to infer a latent representation that models the natural clustering of the data, and additional labelled data points to refine this clustering. The generative performance of the model improves significantly when labelled information is exploited, obtaining a log-likelihood of -79.38 nats on permutation invariant MNIST, while also achieving competitive semi-supervised classification accuracies. The model can also be trained fully unsupervised, and still improves the log-likelihood performance with respect to related methods.


Tunable Efficient Unitary Neural Networks (EUNN) and their application to RNNs

arXiv.org Machine Learning

Using unitary (instead of general) matrices in artificial neural networks (ANNs) is a promising way to solve the gradient explosion/vanishing problem, as well as to enable ANNs to learn long-term correlations in the data. This approach appears particularly promising for Recurrent Neural Networks (RNNs). In this work, we present a new architecture for implementing an Efficient Unitary Neural Network (EUNN); its main advantages can be summarized as follows. Firstly, the representation capacity of the unitary space in an EUNN is fully tunable, ranging from a subspace of SU(N) to the entire unitary space. Secondly, the computational complexity for training an EUNN is merely $\mathcal{O}(1)$ per parameter. Finally, we test the performance of EUNNs on the standard copying task, the pixel-permuted MNIST digit recognition benchmark as well as the Speech Prediction Test (TIMIT). We find that our architecture significantly outperforms both other state-of-the-art unitary RNNs and the LSTM architecture, in terms of the final performance and/or the wall-clock training speed. EUNNs are thus promising alternatives to RNNs and LSTMs for a wide variety of applications.
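The norm-preservation argument behind this abstract can be illustrated with a minimal sketch. It is not the EUNN parameterization from the paper: it only assumes a 2x2 rotation matrix as a stand-in for a unitary recurrent matrix, and the helper names (`rotation`, `scale`, `norm_after_steps`) are hypothetical. Repeatedly applying the rotation keeps the vector norm at 1, while slightly scaled, non-unitary versions shrink or blow up, which is the vanishing/exploding behaviour the abstract refers to.

```python
import math

def rotation(theta):
    """2x2 rotation matrix: a real-valued unitary (orthogonal) matrix."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def scale(matrix, factor):
    """Multiply every entry by `factor`; the result is no longer unitary."""
    return [[factor * x for x in row] for row in matrix]

def matvec(matrix, vec):
    """Plain matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def norm_after_steps(matrix, vec, steps):
    """Apply `matrix` to `vec` `steps` times and return the Euclidean norm."""
    for _ in range(steps):
        vec = matvec(matrix, vec)
    return math.sqrt(sum(v * v for v in vec))

U = rotation(0.3)   # illustrative stand-in for a unitary recurrent matrix
h0 = [1.0, 0.0]     # initial state with norm 1

unitary_norm = norm_after_steps(U, h0, 200)                # stays close to 1
shrinking_norm = norm_after_steps(scale(U, 0.9), h0, 200)  # decays toward 0
growing_norm = norm_after_steps(scale(U, 1.1), h0, 200)    # grows very large

print(unitary_norm, shrinking_norm, growing_norm)
```

The same reasoning applies to gradients flowing backwards through many time steps, since they are multiplied by the transpose of the same matrix at each step.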


Configurable, Photorealistic Image Rendering and Ground Truth Synthesis by Sampling Stochastic Grammars Representing Indoor Scenes

arXiv.org Machine Learning

We propose the configurable rendering of massive quantities of photorealistic images with ground truth for the purposes of training, benchmarking, and diagnosing computer vision models. In contrast to the conventional (crowd-sourced) manual labeling of ground truth for a relatively modest number of RGB-D images captured by Kinect-like sensors, we devise a non-trivial configurable pipeline of algorithms capable of generating a potentially infinite variety of indoor scenes using a stochastic grammar, specifically, one represented by an attributed spatial And-Or graph. We employ physics-based rendering to synthesize photorealistic RGB images while automatically synthesizing detailed, per-pixel ground truth data, including visible surface depth and normal, object identity and material information, as well as illumination. Our pipeline is configurable inasmuch as it enables the precise customization and control of important attributes of the generated scenes. We demonstrate that our generated scenes achieve a performance similar to the NYU v2 Dataset on pre-trained deep learning models. By modifying pipeline components in a controllable manner, we furthermore provide diagnostics on common scene understanding tasks, e.g., depth and surface normal prediction, semantic segmentation, etc.


Stick-Breaking Variational Autoencoders

arXiv.org Machine Learning

We extend Stochastic Gradient Variational Bayes to perform posterior inference for the weights of Stick-Breaking processes. This development allows us to define a Stick-Breaking Variational Autoencoder (SB-VAE), a Bayesian nonparametric version of the variational autoencoder that has a latent representation with stochastic dimensionality. We experimentally demonstrate that the SB-VAE, and a semi-supervised variant, learn highly discriminative latent representations that often outperform the Gaussian VAE's.
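For readers unfamiliar with stick-breaking weights, the standard construction can be sketched as follows. This shows only the generic stick-breaking process, not the SB-VAE inference procedure. The function name `stick_breaking_weights`, the concentration value `alpha = 5.0`, and the truncation at 10 pieces are assumptions chosen for the example. Each fraction v_k breaks off a share of whatever stick length remains, so the k-th weight is v_k times the product of (1 - v_j) over all earlier j.

```python
import random

def stick_breaking_weights(fractions):
    """Turn break fractions v_k in [0, 1] into mixture weights.

    weight_k = v_k * prod_{j < k} (1 - v_j); `remaining` is the unbroken
    part of the stick left after the last break.
    """
    weights = []
    remaining = 1.0
    for v in fractions:
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights, remaining

# Illustrative draw: fractions sampled from Beta(1, alpha), truncated at 10.
rng = random.Random(0)
alpha = 5.0  # assumed concentration parameter for this sketch
fractions = [rng.betavariate(1.0, alpha) for _ in range(10)]
weights, leftover = stick_breaking_weights(fractions)

print(weights, leftover)
```

The weights plus the leftover stick always sum to 1, so the weights form a valid (truncated) distribution over latent dimensions regardless of how many breaks are taken.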


Adversarial Feature Learning

arXiv.org Artificial Intelligence

The ability of the Generative Adversarial Networks (GANs) framework to learn generative models mapping from simple latent distributions to arbitrarily complex data distributions has been demonstrated empirically, with compelling results showing that the latent space of such generators captures semantic variation in the data distribution. Intuitively, models trained to predict these semantic latent representations given data may serve as useful feature representations for auxiliary problems where semantics are relevant. However, in their existing form, GANs have no means of learning the inverse mapping -- projecting data back into the latent space. We propose Bidirectional Generative Adversarial Networks (BiGANs) as a means of learning this inverse mapping, and demonstrate that the resulting learned feature representation is useful for auxiliary supervised discrimination tasks, competitive with contemporary approaches to unsupervised and self-supervised feature learning.


Meet Ray, the Real-Time Machine-Learning Replacement for Spark

#artificialintelligence

Researchers at UC Berkeley's RISELab have developed a new distributed framework designed to enable Python-based machine learning and deep learning workloads to execute in real time with MPI-like power and granularity. Called Ray, the framework is ostensibly a replacement for Spark, which is seen as too slow for some real-world AI applications, and should be ready for production use in less than a year. Ray is one of the first technologies to emerge from RISELab, the research group at Berkeley that followed the highly successful AMPLab, which generated a host of compelling distributed technologies that have impacted the fields of high-performance and enterprise computing alike, including Spark, Mesos, Tachyon, and others. One of the advisors for the old AMPLab and the current RISELab, Computer Science Professor Michael Jordan, discussed the core principles and drivers behind Ray during the recent Strata Hadoop World conference in San Jose, California. "Spark was developed because my students were complaining about Hadoop," Jordan said during a keynote address on March 16.


Adapting ideas from neuroscience for AI

#artificialintelligence

Sign up to download the forthcoming report: "Artificial Intelligence: Teaching Machines to Think Like People," by Jack Clark. This interview is one in a series of interviews that will be featured in the report. A better understanding of the reasons why neurons spike could lead to smart AI systems that can store more information more efficiently, according to Geoff Hinton, who is often referred to as the "godfather" of deep learning. Geoff Hinton is an emeritus distinguished professor at the University of Toronto and an engineering fellow at Google. He is one of the pioneers of neural networks, and was part of the small group of academics that nursed the technology through a period of tepid interest, funding, and development.


Google Has Given DeepMind a Memory, The AI Will not Make the Same Mistakes Again

#artificialintelligence

Google's DeepMind has been playing video games on the Atari since 2014, and it got pretty good too, beating human scores. The problem was, it couldn't remember how it did it. So, every time a new Atari game was introduced, a new neural network was created, but in doing this, the AI could never benefit from its own learned experiences. However, a group of researchers from DeepMind in collaboration with those at Imperial College London has been busy creating an algorithm that could change all that. The new algorithm allows the AI to learn, retain, and then reuse the knowledge that it learns.


Personalized Aesthetics: Recording the Visual Mind using Machine Learning Parallel Forall

#artificialintelligence

Visual aesthetics are very personal, often subconscious, and hard to express. In a world with an overload of photographic content, a lot of time and effort is spent manually curating photographs, and it's often hard to separate the good images from the visual noise. The question we put forward at EyeEm is: can a machine learn personalized aesthetics embodied in a set of chosen photos, and recreate them in a different set? The incapacity to name is a good symptom of disturbance. Does this photograph draw your attention?


09: Gary Marcus -- Making AI More Human

#artificialintelligence

AMLG: Gary I'm super excited to have you today, thanks for coming on the show. We first met a few years ago in New York when I was running a tech meetup, the Singularity society, and you kindly came and spoke. You've been a professor of psychology at NYU for many years where your work has focused on language, biology, and the human mind. You've spent decades studying how children learn, and then in 2015 you founded this startup called Geometric Intelligence, focused on mining cognitive psychology for insights into building better machine learning techniques. Just this past December you were acquired by Uber to run their newly founded AI labs -- congratulations on that exit. So your algorithms offer an alternative approach to what is now a very popular branch of machine learning, called deep learning. Let's talk about deep learning -- it's a sexy buzzword which is thrown into about every startup pitch I see these days, and many corporate presentations, so I'm sure listeners have heard the term. What it really is is a rebranding of an old technique of using neural nets, which dates back to the 50s. Neural nets basically mimic the human neocortex, and by feeding in massive amounts, gigabytes of data and using tons of computational power, the algorithms are able to recognize patterns. Part of the reason why this technique is back in vogue is the combination of increasingly powerful computers combined with the massive training datasets that companies are building up. So there's been a flurry of activity, and the Googles and Facebooks of the world are throwing resources at the technique. As just one example, Facebook, using the over 400 billion photos people have uploaded, has built something called DeepFace, an image recognition tool that's now better than humans at recognizing whether two different images are of the same person. Gary you are well known as a critic of this technique, you've said that it's over-hyped. 
You've argued that there's some low-hanging fruit that deep learning's good at -- specific narrow tasks like perception and categorization, and maybe beating humans at chess -- but you felt that this deep learning mania was taking the field of AI in the wrong direction, that we're not making progress on cognition and strong AI. Or as you've put it, "we wanted Rosie the robot, and instead we got the Roomba."