Deep Learning
Unsupervised Learning of Disentangled Representations from Video
Denton, Emily, Birodkar, Vighnesh
We present a new model, DrNET, that learns disentangled image representations from video. Our approach leverages the temporal coherence of video and a novel adversarial loss to learn a representation that factorizes each frame into a stationary part and a temporally varying component. The disentangled representation can be used for a range of tasks. For example, applying a standard LSTM to the time-varying components enables prediction of future frames. We evaluate our approach on a range of synthetic and real videos, demonstrating the ability to coherently generate hundreds of steps into the future.
Morphological Error Detection in 3D Segmentations
Rolnick, David, Meirovitch, Yaron, Parag, Toufiq, Pfister, Hanspeter, Jain, Viren, Lichtman, Jeff W., Boyden, Edward S., Shavit, Nir
Deep learning algorithms for connectomics rely upon localized classification, rather than overall morphology. This leads to a high incidence of erroneously merged objects. Humans, by contrast, can easily detect such errors by acquiring intuition for the correct morphology of objects. Biological neurons have complicated and variable shapes, which are challenging to learn, and merge errors take a multitude of different forms. We present an algorithm, MergeNet, that shows 3D ConvNets can, in fact, detect merge errors from high-level neuronal morphology. MergeNet follows unsupervised training and operates across datasets. We demonstrate the performance of MergeNet both on a variety of connectomics data and on a dataset created from merged MNIST images.
Surface Networks
Kostrikov, Ilya, Bruna, Joan, Panozzo, Daniele, Zorin, Denis
We study data-driven representations for three-dimensional triangle meshes, which are one of the prevalent objects used to represent 3D geometry. Recent works have developed models that exploit the intrinsic geometry of manifolds and graphs, namely Graph Neural Networks (GNNs) and their spectral variants, which learn from the local metric tensor via the Laplacian operator. Despite offering excellent sample complexity and built-in invariances, intrinsic geometry alone is invariant to isometric deformations, making it unsuitable for many applications. To overcome this limitation, we propose several upgrades to GNNs to leverage extrinsic differential geometry properties of three-dimensional surfaces, increasing their modeling power. In particular, we propose to exploit the Dirac operator, whose spectrum detects principal curvature directions; this is in stark contrast with the classical Laplace operator, which directly measures mean curvature. We coin the resulting model the Surface Network (SN). We demonstrate the efficiency and versatility of SNs on two challenging tasks: temporal prediction of mesh deformations under non-linear dynamics and generative models using a variational autoencoder framework with encoders/decoders given by SNs.
A Multi-Layer K-means Approach for Multi-Sensor Data Pattern Recognition in Multi-Target Localization
Silva, Samuel, Suresh, Rengan, Tao, Feng, Votion, Johnathan, Cao, Yongcan
Multi-target tracking (MTT) focuses on accurately detecting and localizing multiple dynamic targets, whose measurements often come from numerous spatially distributed sensors. Obtaining the locations of the targets can be complex when sensors have limited sensing capabilities. Driven by its potential applications, MTT dates back to the 1960s, when it was initially studied for aerospace applications [1]. Theoretical advances in MTT, new sensor capabilities, and more computational power have made it possible to apply MTT in numerous applications such as surveillance [2], [3], computer vision [4], [5], network and computer security [6], and sensor networks [7]. In general, solving the MTT problem involves three tasks: (i) Extraction - extract target-related information from the raw data acquired from the sensors; (ii) Data association - identify each target's corresponding measurements; and (iii) Estimation - estimate the position of targets via single-target tracking techniques (as shown in [8]-[10]). Perhaps the most challenging task is data association: once the data associated with each target is determined, it becomes much easier to conduct estimation for each individual target. In this paper, we focus on the data association problem. The main objective of this paper is to investigate the applicability of machine learning algorithms to the data association problem and then develop a new multi-layer learning algorithm by leveraging the advantages of different machine learning algorithms.
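As a rough illustration of the data association task described above, the following sketch groups 2-D position measurements by target using plain, single-layer k-means. It is not the multi-layer algorithm the paper develops; the function names, the farthest-point initialization, and the sample measurements are all assumptions made for this example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain k-means; returns (labels, centroids).

    Initialization is deterministic (an assumption of this sketch): start
    from the first point, then repeatedly add the point farthest from the
    centroids chosen so far.
    """
    centroids = [points[0]]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(farthest)

    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each measurement goes to its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Hypothetical measurements from two well-separated targets.
measurements = [(0.0, 0.1), (0.2, -0.1), (10.0, 10.1),
                (9.8, 9.9), (-0.1, 0.0), (10.1, 10.0)]
labels, centroids = kmeans(measurements, k=2)
```

Each label plays the role of a target identity, and each centroid is a crude position estimate for that target. Real sensor data would add noise, clutter, and an unknown number of targets, none of which this sketch handles.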
Recurrent Estimation of Distributions
Oliva, Junier B., Dubey, Kumar Avinava, Poczos, Barnabas, Xing, Eric, Schneider, Jeff
This paper presents the recurrent estimation of distributions (RED) for modeling real-valued data in a semiparametric fashion. RED models make two novel uses of recurrent neural networks (RNNs) for density estimation of general real-valued data. First, RNNs are used to transform input covariates into a latent space to better capture conditional dependencies in inputs. Afterwards, an RNN is used to compute the conditional distributions of the latent covariates. The resulting model is efficient to train, compute, and sample from, whilst producing normalized pdfs. The effectiveness of RED is shown via several real-world data experiments. Our results show that RED models achieve a lower held-out negative log-likelihood than other neural network approaches across multiple dataset sizes and dimensionalities. Further context for the efficacy of RED is provided by considering anomaly detection tasks, where we also observe better performance over alternative models.
The Cramer Distance as a Solution to Biased Wasserstein Gradients
Bellemare, Marc G., Danihelka, Ivo, Dabney, Will, Mohamed, Shakir, Lakshminarayanan, Balaji, Hoyer, Stephan, Munos, Rémi
The Wasserstein probability metric has received much attention from the machine learning community. Unlike the Kullback-Leibler divergence, which strictly measures change in probability, the Wasserstein metric reflects the underlying geometry between outcomes. The value of being sensitive to this geometry has been demonstrated, among others, in ordinal regression and generative modelling. In this paper we describe three natural properties of probability divergences that reflect requirements from machine learning: sum invariance, scale sensitivity, and unbiased sample gradients. The Wasserstein metric possesses the first two properties but, unlike the Kullback-Leibler divergence, does not possess the third. We provide empirical evidence suggesting that this is a serious issue in practice. Leveraging insights from probabilistic forecasting we propose an alternative to the Wasserstein metric, the Cramér distance. We show that the Cramér distance possesses all three desired properties, combining the best of the Wasserstein and Kullback-Leibler divergences. To illustrate the relevance of the Cramér distance in practice we design a new algorithm, the Cramér Generative Adversarial Network (GAN), and show that it performs significantly better than the related Wasserstein GAN.
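For one-dimensional distributions, both metrics can be written in terms of the gap between cumulative distribution functions: the 1-Wasserstein distance integrates |F_P(x) - F_Q(x)|, while the squared Cramér distance integrates (F_P(x) - F_Q(x))^2. The sketch below computes both for empirical samples. The abstract itself does not give these formulas, so treat the definitions as assumptions of this example; the function names are hypothetical and nothing here is taken from the authors' code.

```python
from bisect import bisect_right


def empirical_cdf(sorted_sample, x):
    """Fraction of sample points that are <= x."""
    return bisect_right(sorted_sample, x) / len(sorted_sample)


def cdf_distances(sample_p, sample_q):
    """Return (wasserstein_1, cramer_squared) for two 1-D samples.

    Empirical CDFs are step functions, so both integrals reduce to a sum
    over the intervals between consecutive sample values.
    """
    p = sorted(sample_p)
    q = sorted(sample_q)
    grid = sorted(set(p) | set(q))

    wasserstein_1 = 0.0
    cramer_squared = 0.0
    for left, right in zip(grid, grid[1:]):
        # The CDF gap is constant on the interval [left, right).
        gap = empirical_cdf(p, left) - empirical_cdf(q, left)
        width = right - left
        wasserstein_1 += abs(gap) * width
        cramer_squared += gap * gap * width
    return wasserstein_1, cramer_squared


w1, c2 = cdf_distances([0.0, 2.0], [1.0, 1.0])
```

Both quantities grow when probability mass has to travel farther, which is the geometric sensitivity the abstract contrasts with the Kullback-Leibler divergence. The sketch does not demonstrate the bias of sample gradients, which is the paper's main argument.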
Joint auto-encoders: a flexible multi-task learning framework
Epstein, Baruch, Meir, Ron, Michaeli, Tomer
The incorporation of prior knowledge into learning is essential in achieving good performance based on small noisy samples. Such knowledge is often incorporated through the availability of related data arising from domains and tasks similar to the one of current interest. Ideally one would like to allow both the data for the current task and for previous related tasks to self-organize the learning system in such a way that commonalities and differences between the tasks are learned in a data-driven fashion. We develop a framework for learning multiple tasks simultaneously, based on sharing features that are common to all tasks, achieved through the use of a modular deep feedforward neural network consisting of shared branches, dealing with the common features of all tasks, and private branches, learning the specific unique aspects of each task. Once an appropriate weight sharing architecture has been established, learning takes place through standard algorithms for feedforward networks, e.g., stochastic gradient descent and its variations. The method deals with domain adaptation and multi-task learning in a unified fashion, and can easily deal with data arising from different types of sources. Numerical experiments demonstrate the effectiveness of learning in domain adaptation and transfer learning setups, and provide evidence for the flexible and task-oriented representations arising in the network.
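The shared/private branch idea can be sketched as a forward pass through a pair of tiny auto-encoders. This is only an illustration under stated assumptions: the single-layer encoders, the choice to share only the encoder branch, the ReLU nonlinearity, the random untrained weights, and every name below are hypothetical and not taken from the paper.

```python
import random


def linear(weights, vec):
    """Matrix-vector product with the matrix stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def relu(vec):
    return [max(0.0, v) for v in vec]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class JointAutoencoders:
    """One auto-encoder per task, with a shared and a private encoder branch."""

    def __init__(self, input_dim, shared_dim, private_dim, num_tasks, seed=0):
        rng = random.Random(seed)
        # Shared branch: a single weight matrix reused by every task.
        self.shared_enc = random_matrix(shared_dim, input_dim, rng)
        # Private branches: one weight matrix per task.
        self.private_enc = [random_matrix(private_dim, input_dim, rng)
                            for _ in range(num_tasks)]
        # Per-task decoders read the concatenated shared + private features.
        self.decoders = [random_matrix(input_dim, shared_dim + private_dim, rng)
                         for _ in range(num_tasks)]

    def encode(self, task, x):
        shared = relu(linear(self.shared_enc, x))
        private = relu(linear(self.private_enc[task], x))
        return shared, private

    def reconstruct(self, task, x):
        shared, private = self.encode(task, x)
        return linear(self.decoders[task], shared + private)


model = JointAutoencoders(input_dim=4, shared_dim=3, private_dim=2, num_tasks=2)
x = [0.5, -1.0, 0.25, 2.0]
reconstruction = model.reconstruct(task=0, x=x)
```

Because the shared branch uses the same weights for every task, gradients from all tasks would update it during training, while each private branch only sees its own task's data. Training itself (for example, stochastic gradient descent on a reconstruction loss) is omitted here.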
AlphaGo retires from competitive Go after defeating world number one 3-0
AlphaGo is going out on top. After beating Ke Jie, the world's best player of the ancient Chinese board game Go, for the third time today at the Future of Go Summit in Wuzhen, Google's DeepMind unit announced that it would be the last event match the AI plays. In a statement, DeepMind co-founder and co-CEO Demis Hassabis said the reason was that this week's summit represented "the highest possible pinnacle for AlphaGo as a competitive program." AlphaGo rose to prominence a little over a year ago when it unexpectedly defeated legendary player Lee Se-dol 4-1 in a match held in Seoul. Most computer scientists expected the feat of beating a top Go player with artificial intelligence to be decades away due to the game's complexity and nuance, but with this week's comprehensive defeat of Ke Jie the matter has been settled.
Deep Learning and AI Success Stories - insideBIGDATA
The insideBIGDATA Guide to Deep Learning & Artificial Intelligence is a useful new resource directed toward enterprise thought leaders who wish to gain strategic insights into this exciting area of technology. In this guide, we take a high-level view of AI and deep learning in terms of how it's being used and what technological advances have made it possible. We also explain the difference between AI, machine learning and deep learning, and examine the intersection of AI and HPC. We present the results of a recent insideBIGDATA survey that reflects how well these new technologies are being received. Finally, we take a look at a number of high-profile use case examples showing the effective use of AI in a variety of problem domains.