 Deep Learning


Unifying Count-Based Exploration and Intrinsic Motivation

arXiv.org Machine Learning

We consider an agent's uncertainty about its environment and the problem of generalizing this uncertainty across observations. Specifically, we focus on the problem of exploration in non-tabular reinforcement learning. Drawing inspiration from the intrinsic motivation literature, we use density models to measure uncertainty, and propose a novel algorithm for deriving a pseudo-count from an arbitrary density model. This technique enables us to generalize count-based exploration algorithms to the non-tabular case. We apply our ideas to Atari 2600 games, providing sensible pseudo-counts from raw pixels. We transform these pseudo-counts into intrinsic rewards and obtain significantly improved exploration in a number of hard games, including the infamously difficult Montezuma's Revenge.
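
The abstract does not spell out how a pseudo-count is derived. A minimal sketch follows. It assumes the formula from the paper itself, N = rho * (1 - rho') / (rho' - rho), where rho is the model's probability of an observation before seeing it and rho' is the probability after one update on it. The `EmpiricalDensity` class and the `observe` helper are hypothetical names introduced only for illustration. With a plain empirical-frequency model the pseudo-count should reduce to the true visit count.

```python
from collections import Counter


def pseudo_count(rho, rho_prime):
    """Pseudo-count from the density before (rho) and after (rho_prime) one update.

    Assumes the model is 'learning-positive', i.e. rho_prime > rho.
    """
    if rho_prime <= rho:
        raise ValueError("model must assign higher probability after the update")
    return rho * (1.0 - rho_prime) / (rho_prime - rho)


class EmpiricalDensity:
    """Hypothetical stand-in for an arbitrary density model: plain frequencies."""

    def __init__(self):
        self.counts = Counter()
        self.total = 0

    def prob(self, x):
        return self.counts[x] / self.total if self.total else 0.0

    def update(self, x):
        self.counts[x] += 1
        self.total += 1


def observe(model, x):
    """Query the model, update it on x, and return the pseudo-count of x."""
    rho = model.prob(x)
    model.update(x)
    rho_prime = model.prob(x)
    return pseudo_count(rho, rho_prime)


if __name__ == "__main__":
    model = EmpiricalDensity()
    for obs in ["a", "b", "a", "a"]:
        print(obs, observe(model, obs))
```

In an exploration setting, the returned value would be turned into an intrinsic reward that shrinks as the pseudo-count grows. The exact bonus form is left out here because the abstract does not state it.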


Recurrent Neural Networks for Multivariate Time Series with Missing Values

arXiv.org Machine Learning

Multivariate time series data in practical applications, such as health care, geoscience, and biology, are characterized by a variety of missing values. In time series prediction and other related tasks, it has been noted that missing values and their missing patterns are often correlated with the target labels, a.k.a. informative missingness. There is very limited work on exploiting the missing patterns for effective imputation and improving prediction performance. In this paper, we develop novel deep learning models, namely GRU-D, as one of the early attempts. GRU-D is based on the Gated Recurrent Unit (GRU), a state-of-the-art recurrent neural network. It takes two representations of missing patterns, i.e., masking and time interval, and effectively incorporates them into a deep model architecture so that it not only captures the long-term temporal dependencies in time series, but also utilizes the missing patterns to achieve better prediction results. Experiments on time series classification tasks with real-world clinical datasets (MIMIC-III, PhysioNet) and synthetic datasets demonstrate that our models achieve state-of-the-art performance and provide useful insights for better understanding and utilization of missing values in time series analysis.
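
The two representations of missingness named above, masking and time interval, can be sketched as follows. This is an illustration under assumptions, not the authors' code: missing entries are taken to be encoded as NaN, the function name `masking_and_intervals` is hypothetical, and the interval recurrence (time since the variable was last observed, starting from zero at the first step) follows the definition as recalled from the GRU-D paper rather than anything stated in this abstract.

```python
import numpy as np


def masking_and_intervals(x, timestamps):
    """Return (mask, delta) for a series x of shape (T, D) with NaN for missing.

    mask[t, d]  = 1.0 if x[t, d] is observed, else 0.0
    delta[t, d] = time elapsed since variable d was last observed
                  (accumulated from the first time step if never observed)
    """
    x = np.asarray(x, dtype=float)
    s = np.asarray(timestamps, dtype=float)
    mask = (~np.isnan(x)).astype(float)
    delta = np.zeros_like(x)
    for t in range(1, x.shape[0]):
        gap = s[t] - s[t - 1]
        # If the previous step was missing, keep accumulating its interval.
        delta[t] = gap + (1.0 - mask[t - 1]) * delta[t - 1]
    return mask, delta


if __name__ == "__main__":
    x = [[1.0, np.nan], [np.nan, np.nan], [3.0, 4.0]]
    mask, delta = masking_and_intervals(x, timestamps=[0.0, 1.0, 3.0])
    print(mask)
    print(delta)
```

Both arrays have the same shape as the input, so they can be fed to a recurrent model alongside the (imputed) values at each time step.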


Optimal Binary Autoencoding with Pairwise Correlations

arXiv.org Machine Learning

We formulate learning of a binary autoencoder as a biconvex optimization problem which learns from the pairwise correlations between encoded and decoded bits. Among all possible algorithms that use this information, ours finds the autoencoder that reconstructs its inputs with worst-case optimal loss. The optimal decoder is a single layer of artificial neurons, emerging entirely from the minimax loss minimization, and with weights learned by convex optimization. All this is reflected in competitive experimental results, demonstrating that binary autoencoding can be done efficiently by conveying information in pairwise correlations in an optimal fashion.


Importance Weighted Autoencoders

arXiv.org Machine Learning

The variational autoencoder (VAE; Kingma, Welling (2014)) is a recently proposed generative model pairing a top-down generative network with a bottom-up recognition network which approximates posterior inference. It typically makes strong assumptions about posterior inference, for instance that the posterior distribution is approximately factorial, and that its parameters can be approximated with nonlinear regression from the observations. As we show empirically, the VAE objective can lead to overly simplified representations which fail to use the network's entire modeling capacity. We present the importance weighted autoencoder (IWAE), a generative model with the same architecture as the VAE, but which uses a strictly tighter log-likelihood lower bound derived from importance weighting. In the IWAE, the recognition network uses multiple samples to approximate the posterior, giving it increased flexibility to model complex posteriors which do not fit the VAE modeling assumptions. We show empirically that IWAEs learn richer latent space representations than VAEs, leading to improved test log-likelihood on density estimation benchmarks.
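
To make the "tighter lower bound from importance weighting" concrete, here is a minimal numerical sketch. It assumes the log importance weights log w = log p(x, h) - log q(h | x) have already been computed by some model and are passed in as an array of shape (n, k), with k samples per data point. The function name `iwae_bound` is hypothetical. With k = 1 the estimate coincides with the usual VAE bound estimate, and by Jensen's inequality the log of the averaged weights is never below the average of the log weights.

```python
import numpy as np


def iwae_bound(log_w):
    """Monte Carlo estimate of the importance weighted bound.

    log_w: array of shape (n, k) holding log importance weights,
           k samples for each of n data points.
    Returns the mean over data points of log( (1/k) * sum_i w_i ).
    """
    log_w = np.asarray(log_w, dtype=float)
    k = log_w.shape[-1]
    # Numerically stable log-mean-exp over the sample axis.
    m = log_w.max(axis=-1, keepdims=True)
    log_mean = m.squeeze(-1) + np.log(np.exp(log_w - m).sum(axis=-1)) - np.log(k)
    return float(np.mean(log_mean))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    log_w = rng.normal(loc=-90.0, scale=3.0, size=(5, 8))  # made-up weights
    print("average of log weights:", log_w.mean())
    print("importance weighted   :", iwae_bound(log_w))
```

The random weights above are placeholders for illustration only and are not results from the paper.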


Adversarial Ladder Networks

arXiv.org Machine Learning

The use of unsupervised data in addition to supervised data in training discriminative neural networks has improved the performance of this classification scheme. However, the best results were achieved with a training process divided into two parts: first, an unsupervised pre-training step initializes the weights of the network, and then these weights are refined using supervised data. On the other hand, adversarial noise has improved the results of classical supervised learning. Recently, a new neural network topology called the Ladder Network, whose key idea is based on some properties of hierarchical latent variable models, has been proposed as a technique to train a neural network using supervised and unsupervised data at the same time, an approach called semi-supervised learning. This technique has reached state-of-the-art classification. In this work we add adversarial noise to the ladder network and obtain state-of-the-art classification, with several important conclusions on how adversarial noise can help, along with new possible lines of investigation. We also propose an alternative way to add adversarial noise to unsupervised data.


Why Deep Learning is Radically Different from Machine Learning – Intuition Machine

#artificialintelligence

There is a lot of confusion these days about Artificial Intelligence (AI), Machine Learning (ML) and Deep Learning (DL). There certainly is a massive uptick of articles about AI being a competitive game changer and arguing that enterprises should begin to seriously explore the opportunities. The distinction between AI, ML and DL is very clear to practitioners in these fields. AI is the all-encompassing umbrella that covers everything from Good Old-Fashioned AI (GOFAI) all the way to connectionist architectures like Deep Learning. ML is a sub-field of AI that covers anything that has to do with the study of learning algorithms by training with data. There are whole swaths (not swatches) of techniques that have been developed over the years, like Linear Regression, K-means, Decision Trees, Random Forest, PCA, SVM and finally Artificial Neural Networks (ANN).


The Important Technology Ever is Here - CML News

#artificialintelligence

Written by Ophir Gottlieb and Jason Hitchings. Preface: It's coming. It will impact almost everything: communication will be revolutionized, entertainment will expand, business structures will be transformed. It is one of the revolutionary themes, one of the fundamental shifts coming in the very near future that will change how we live, work, and play. "This is a technology whose consumer base looks increasingly like [all of] humanity." Source: Dave Thier, Forbes. Facebook (NASDAQ:FB), Alphabet (NASDAQ:GOOGL), Amazon (NASDAQ:AMZN), Apple (NASDAQ:AAPL), Microsoft (NASDAQ:MSFT), International Business Machines (NYSE:IBM) and China's Baidu (NASDAQ:BIDU) are already in. Google acquired deep learning research group DeepMind for half a billion dollars in 2014. IBM acquired AlchemyAPI last year and Apple made two acquisitions in just four days. The next "big thing" is Virtual Reality -- it's larger than anyone is forecasting, and the likely victor will not be any of the companies we just listed above. This is the opportunity so many investors say they welcome -- say they search for. The opportunity to find the "next Apple," or the "next Google." Friends, it's coming right now, and it lies in the depths of technology's core. "Recent breakthroughs in GPU-accelerated deep-learning techniques have made it possible to reach exceptional improvements in pattern recognition.


Deep Learning (Part 1)

#artificialintelligence

Each day we hear about some machine learning innovation which blows our mind and enthralls us. A lot of these innovations use deep learning concepts. In this post, I will try to jot down a few useful basics. Let's start with neural networks (NN). An NN can outperform traditional classification algorithms such as logistic regression and naive Bayes in complex tasks that involve large numbers of variables and large amounts of data. However, since an NN uses nodes arranged in layers to train and detect patterns, the number of nodes required increases exponentially for very complex tasks.


Why Lawsuits Will Become A Thing Of The Past With This New Technology - 52 Insights

#artificialintelligence

The issue of artificial intelligence in law is becoming an increasingly relevant subject. Many legal companies are already using advanced technology to take care of routine work, and now it seems AI is set to deal with legal issues before they ever have a chance to make it to court. Tech company Intraspexion claims to have developed a deep learning system that can alert companies in advance to any risk of having a legal case brought against them. With the average lawsuit costing a company in the US $350,000, it's easy to see the appeal of 'preventative law' technology. CEO Nick Brestoff explains that the system detects risks in company emails and other internal communications that no human would be able to spot, effectively eradicating companies' vulnerability to being served with painfully expensive lawsuits.