Goto

Collaborating Authors

On decision regions of narrow deep neural networks

arXiv.org Machine Learning

We show that for neural network functions that have width less or equal to the input dimension all connected components of decision regions are unbounded. The result holds for continuous and strictly monotonic activation functions as well as for ReLU activation. This complements recent results on approximation capabilities of [Hanin 2017 Approximating] and connectivity of decision regions of [Nguyen 2018 Neural] for such narrow neural networks. Further, we give an example that negatively answers the question posed in [Nguyen 2018 Neural] whether one of their main results still holds for ReLU activation. Our results are illustrated by means of numerical experiments.


DeepDrum: An Adaptive Conditional Neural Network

arXiv.org Machine Learning

Considering music as a sequence of events with multiple complex dependencies, the Long Short-Term Memory (LSTM) architecture has proven very efficient in learning and reproducing musical styles. However, the generation of rhythms requires additional information regarding musical structure and accompanying instruments. In this paper we present DeepDrum, an adaptive Neural Network capable of generating drum rhythms under constraints imposed by Feed-Forward (Conditional) Layers which contain musical parameters along with given instrumentation information (e.g. bass and guitar notes). Results on generated drum sequences are presented indicating that DeepDrum is effective in producing rhythms that resemble the learned style, while at the same time conforming to given constraints that were unknown during the training process.


Uncertainty in Multitask Transfer Learning

arXiv.org Machine Learning

Using variational Bayes neural networks, we develop an algorithm capable of accumulating knowledge into a prior from multiple different tasks. The result is a rich and meaningful prior capable of few-shot learning on new tasks. The posterior can go beyond the mean field approximation and yields good uncertainty on the performed experiments. Analysis on toy tasks shows that it can learn from significantly different tasks while finding similarities among them. Experiments of Mini-Imagenet yields the new state of the art with 74.5% accuracy on 5 shot learning. Finally, we provide experiments showing that other existing methods can fail to perform well in different benchmarks.


Auto Deep Compression by Reinforcement Learning Based Actor-Critic Structure

arXiv.org Machine Learning

Model-based compression is an effective, facilitating, and expanded model of neural network models with limited computing and low power. However, conventional models of compression techniques utilize crafted features [2,3,12] and explore specialized areas for exploration and design of large spaces in terms of size, speed, and accuracy, which usually have returns Less and time is up. This paper will effectively analyze deep auto compression (ADC) and reinforcement learning strength in an effective sample and space design, and improve the compression quality of the model. The results of compression of the advanced model are obtained without any human effort and in a completely automated way. With a 4- fold reduction in FLOP, the accuracy of 2.8% is higher than the manual compression model for VGG-16 in ImageNet.


Constructing Deep Neural Networks by Bayesian Network Structure Learning

arXiv.org Artificial Intelligence

We introduce a principled approach for unsupervised structure learning of deep neural networks. We propose a new interpretation for depth and inter-layer connectivity where conditional independencies in the input distribution are encoded hierarchically in the network structure. Thus, the depth of the network is determined inherently (equal to the maximal order of independence in the input distribution). The proposed method casts the problem of neural network structure learning as a problem of Bayesian network structure learning. Then, instead of directly learning the discriminative structure, it learns a generative graph, constructs its stochastic inverse, and then constructs a discriminative graph. We prove that conditional-dependency relations among the latent variables in the generative graph are preserved in the class-conditional discriminative graph. We demonstrate on image classification benchmarks that the deepest layers (convolutional and dense) of common networks can be replaced by significantly smaller learned structures, while maintaining classification accuracy---state-of-the-art on tested benchmarks. Our structure learning algorithm requires a small computational cost and runs efficiently on a standard desktop CPU.