Wiedemann, Simon
A Deep Learning Method for Simultaneous Denoising and Missing Wedge Reconstruction in Cryogenic Electron Tomography
Wiedemann, Simon, Heckel, Reinhard
Cryogenic electron tomography (cryo-ET) is a technique for imaging biological samples such as viruses, cells, and proteins in 3D. A microscope collects a series of 2D projections of the sample, and the goal is to reconstruct the 3D density of the sample, called the tomogram. This is difficult because the 2D projections have a missing wedge of information and are noisy. Tomograms reconstructed with conventional methods such as filtered back-projection suffer from noise as well as from artifacts and anisotropic resolution caused by the missing wedge of information. To improve the visual quality and resolution of such tomograms, we propose a deep-learning approach for simultaneous denoising and missing wedge reconstruction called DeepDeWedge. DeepDeWedge is based on fitting a neural network to the 2D projections with a self-supervised loss inspired by noise2noise-like methods. The algorithm requires no training or ground truth data. Experiments on synthetic and real cryo-ET data show that DeepDeWedge achieves competitive performance for deep learning-based denoising and missing wedge reconstruction of cryo-ET tomograms.
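To make the self-supervised objective concrete, below is a minimal, illustrative PyTorch sketch of a noise2noise-style loss for joint denoising and missing wedge reconstruction. It assumes two independently reconstructed half-set tomograms (e.g. from even/odd tilt splits) and a 3D CNN `model`; the function names and the wedge geometry are illustrative assumptions, not taken from the paper's reference implementation.

```python
# Illustrative sketch of a noise2noise-style objective for joint denoising and
# missing-wedge reconstruction (not the authors' reference code).
import torch
import torch.fft


def wedge_mask(shape, half_angle_deg=30.0, device="cpu"):
    """Binary Fourier-space mask whose zeros mimic a missing wedge (illustrative geometry)."""
    d, h, w = shape
    kz = torch.fft.fftfreq(d, device=device).view(-1, 1, 1)
    kx = torch.fft.fftfreq(w, device=device).view(1, 1, -1)
    angle = torch.atan2(kz.abs(), kx.abs().clamp_min(1e-8)) * 180.0 / torch.pi
    return (angle < (90.0 - half_angle_deg)).float().expand(d, h, w)


def apply_wedge(vol, mask):
    """Zero out the Fourier components flagged as missing by `mask`."""
    return torch.fft.ifftn(torch.fft.fftn(vol) * mask).real


def noise2noise_wedge_loss(model, vol0, vol1, mask, extra_mask):
    """Predict one noisy half-set volume from the other; an extra artificial wedge is
    removed from the model input so the network also learns to fill in missing regions."""
    inp = apply_wedge(vol0, mask * extra_mask)   # input: even more information removed
    target = apply_wedge(vol1, mask)             # target: independent noisy half
    pred = model(inp[None, None]).squeeze()      # add and remove batch/channel dims
    return ((pred - target) ** 2).mean()
```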
Quantum Policy Iteration via Amplitude Estimation and Grover Search -- Towards Quantum Advantage for Reinforcement Learning
Wiedemann, Simon, Hein, Daniel, Udluft, Steffen, Mendl, Christian
We present a full implementation and simulation of a novel quantum reinforcement learning method. Our work is a detailed and formal proof of concept for how quantum algorithms can be used to solve reinforcement learning problems and shows that, given access to error-free, efficient quantum realizations of the agent and environment, quantum methods can yield provable improvements over classical Monte Carlo-based methods in terms of sample complexity. Our approach shows in detail how to combine amplitude estimation and Grover search into a policy evaluation and improvement scheme. We first develop quantum policy evaluation (QPE), which is quadratically more efficient than an analogous classical Monte Carlo estimation and is based on a quantum mechanical realization of a finite Markov decision process (MDP). Building on QPE, we derive a quantum policy iteration that repeatedly improves an initial policy using Grover search until the optimum is reached. Finally, we present an implementation of our algorithm for a two-armed bandit MDP, which we then simulate.
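For context, here is a purely classical Monte Carlo baseline for policy evaluation on a two-armed bandit; the policy and reward probabilities are made-up illustration values. The paper's quantum policy evaluation replaces this sampling loop with amplitude estimation, which for a target precision ε needs on the order of 1/ε environment queries instead of the roughly 1/ε² samples used here.

```python
# Classical Monte Carlo policy evaluation for a two-armed bandit (illustration only).
import random


def evaluate_policy_mc(policy, reward_probs, n_samples=10_000, seed=0):
    """Classical Monte Carlo estimate of the expected reward of a stochastic policy."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        arm = 0 if rng.random() < policy[0] else 1                 # sample an action
        total += 1.0 if rng.random() < reward_probs[arm] else 0.0  # sample a Bernoulli reward
    return total / n_samples


policy = (0.5, 0.5)          # illustrative uniform policy over the two arms
reward_probs = (0.2, 0.8)    # illustrative Bernoulli reward probability per arm
print(evaluate_policy_mc(policy, reward_probs))   # approx. 0.5 * 0.2 + 0.5 * 0.8 = 0.5
```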
DeepCABAC: A Universal Compression Algorithm for Deep Neural Networks
Wiedemann, Simon, Kirchhoffer, Heiner, Matlage, Stefan, Haase, Paul, Marban, Arturo, Marinc, Talmaj, Neumann, David, Nguyen, Tung, Osman, Ahmed, Marpe, Detlev, Schwarz, Heiko, Wiegand, Thomas, Samek, Wojciech
The field of video compression has developed some of the most sophisticated and efficient compression algorithms known in the literature, enabling very high compressibility for little loss of information. Whilst some of these techniques are domain specific, many of their underlying principles are universal in that they can be adapted and applied to compressing different types of data. In this work we present DeepCABAC, a compression algorithm for deep neural networks that is based on one of the state-of-the-art video coding techniques. Concretely, it applies a Context-based Adaptive Binary Arithmetic Coder (CABAC) to the network's parameters; CABAC was originally designed for the H.264/AVC video coding standard and became the state of the art for lossless compression. Moreover, DeepCABAC employs a novel quantization scheme that minimizes the rate-distortion function while simultaneously taking the impact of quantization on the accuracy of the network into account. Experimental results show that DeepCABAC consistently attains higher compression rates than previously proposed coding techniques for neural network compression. For instance, it is able to compress the VGG16 ImageNet model by x63.6 with no loss of accuracy, thus being able to represent the entire network with merely 8.7MB. The source code for encoding and decoding can be found at https://github.com/fraunhoferhhi/DeepCABAC.
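To illustrate the rate-distortion idea behind such a quantization step, here is a small numpy sketch that assigns each weight to the codebook entry minimizing squared distortion plus a weighted bit-cost estimate. The codebook, symbol probabilities, lambda, and the self-information rate model are illustrative placeholders; DeepCABAC's actual rate is derived from its CABAC context models.

```python
# Illustrative rate-distortion quantization: each weight is mapped to the codebook
# entry minimizing distortion + lambda * estimated bit cost. Not DeepCABAC's rate model.
import numpy as np


def rd_quantize(weights, codebook, probs, lam=1e-4):
    w = weights.reshape(-1, 1)                        # (N, 1) weights as a column
    distortion = (w - codebook.reshape(1, -1)) ** 2   # squared error per candidate level
    rate = -np.log2(probs).reshape(1, -1)             # self-information estimate in bits
    assignment = np.argmin(distortion + lam * rate, axis=1)
    return codebook[assignment].reshape(weights.shape)


weights = np.random.randn(4, 4).astype(np.float32) * 0.1
codebook = np.array([-0.1, 0.0, 0.1], dtype=np.float32)   # assumed quantization levels
probs = np.array([0.2, 0.6, 0.2])                          # assumed symbol probabilities
print(rd_quantize(weights, codebook, probs))
```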
DeepCABAC: Context-adaptive binary arithmetic coding for deep neural network compression
Wiedemann, Simon, Kirchhoffer, Heiner, Matlage, Stefan, Haase, Paul, Marban, Arturo, Marinc, Talmaj, Neumann, David, Osman, Ahmed, Marpe, Detlev, Schwarz, Heiko, Wiegand, Thomas, Samek, Wojciech
We present DeepCABAC, a novel context-adaptive binary arithmetic coder for compressing deep neural networks. It quantizes each weight parameter by minimizing a weighted rate-distortion function, which implicitly takes the impact of quantization on the accuracy of the network into account. Subsequently, it compresses the quantized values into a bitstream representation with minimal redundancies. We show that DeepCABAC is able to reach very high compression ratios across a wide set of different network architectures and datasets. For instance, we are able to compress the VGG16 ImageNet model by x63.6 with no loss of accuracy, thus being able to represent the entire network with merely 8.7MB.

From all the different proposed methods, sparsification followed by weight quantization and entropy coding arguably belongs to the most popular approaches, since very high compression ratios can be achieved under such a paradigm (Han et al., 2015a; Louizos et al., 2017; Wiedemann et al., 2018a;b). Whereas much of the research has focused on the sparsification part, substantially less work has focused on improving the latter two steps. In fact, most of the proposed (post-sparsity) compression algorithms come with at least one of the following caveats: 1) they decouple the quantization procedure from the subsequent lossless compression algorithm, 2) they ignore correlations between the parameters, and 3) they apply a lossless compression algorithm that produces a bitstream with more redundancies than principally needed.
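As a rough sanity check on "a bitstream with minimal redundancies": the size achievable by any lossless entropy coder (CABAC included) is bounded below by the empirical entropy of the quantized symbols. The sketch below computes that bound for a made-up symbol distribution; it is not part of DeepCABAC itself.

```python
# Lower bound on the lossless bitstream size of quantized weights: an entropy coder
# can approach, but not beat, the empirical entropy of the symbol distribution.
import numpy as np


def entropy_bound_bits(quantized):
    values, counts = np.unique(quantized, return_counts=True)
    p = counts / counts.sum()
    bits_per_symbol = -(p * np.log2(p)).sum()
    return bits_per_symbol * quantized.size


q = np.random.choice([-0.1, 0.0, 0.1], size=10_000, p=[0.05, 0.9, 0.05])  # made-up symbols
print(entropy_bound_bits(q) / 8 / 1024, "KiB at minimum")
```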
Robust and Communication-Efficient Federated Learning from Non-IID Data
Sattler, Felix, Wiedemann, Simon, Müller, Klaus-Robert, Samek, Wojciech
Federated Learning allows multiple parties to jointly train a deep learning model on their combined data, without any of the participants having to reveal their local data to a centralized server. This form of privacy-preserving collaborative learning however comes at the cost of a significant communication overhead during training. To address this problem, several compression methods have been proposed in the distributed training literature that can reduce the amount of required communication by up to three orders of magnitude. These existing methods however are only of limited utility in the Federated Learning setting, as they either only compress the upstream communication from the clients to the server (leaving the downstream communication uncompressed) or only perform well under idealized conditions, such as an iid distribution of the client data, which typically cannot be found in Federated Learning. In this work, we propose Sparse Ternary Compression (STC), a new compression framework that is specifically designed to meet the requirements of the Federated Learning environment. Our experiments on four different learning tasks demonstrate that STC distinctively outperforms Federated Averaging in common Federated Learning scenarios where clients either a) hold non-iid data, b) use small batch sizes during training, or where c) the number of clients is large and the participation rate in every communication round is low. We furthermore show that even if the clients hold iid data and use medium-sized batches for training, STC still behaves Pareto-superior to Federated Averaging in the sense that it achieves fixed target accuracies on our benchmarks within both fewer training iterations and a smaller communication budget.
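A minimal sketch of the compression step suggested by the name Sparse Ternary Compression, assuming the usual top-k-plus-ternarization recipe: keep only the largest-magnitude entries of a model update and replace them by their sign times one shared magnitude. The sparsity level is an arbitrary example value, and the lossless encoding of positions and values that the full framework would apply is omitted.

```python
# Illustrative sparse ternarization of a model update (position/value encoding omitted).
import torch


def sparse_ternary_compress(update, sparsity=0.01):
    """Keep the k largest-magnitude entries and replace them by sign * mean magnitude."""
    k = max(1, int(sparsity * update.numel()))
    flat = update.flatten()
    _, idx = torch.topk(flat.abs(), k)             # positions of the k largest entries
    mu = flat[idx].abs().mean()                    # single shared magnitude
    out = torch.zeros_like(flat)
    out[idx] = torch.sign(flat[idx]) * mu          # ternary entries {-mu, 0, +mu}
    return out.view_as(update)


update = torch.randn(100_000)
compressed = sparse_ternary_compress(update, sparsity=0.01)
print(int((compressed != 0).sum()), "of", update.numel(), "entries are non-zero")
```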
Entropy-Constrained Training of Deep Neural Networks
Wiedemann, Simon, Marban, Arturo, Müller, Klaus-Robert, Samek, Wojciech
We propose a general framework for neural network compression that is motivated by the Minimum Description Length (MDL) principle. For that we first derive an expression for the entropy of a neural network, which measures its complexity explicitly in terms of its bit-size. This objective generalizes many of the compression techniques proposed in the literature, in that pruning or reducing the cardinality of the weight elements of the network can be seen as special cases of entropy-minimization techniques. Furthermore, we derive a continuous relaxation of the objective, which allows us to minimize it using gradient-based optimization techniques. Finally, we show that we can reach state-of-the-art compression results on different network architectures and data sets.
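One illustrative way such a continuous relaxation can look (not necessarily the paper's exact formulation): softly assign every weight to a small set of quantization centers and penalize the entropy of the average assignment distribution, which serves as a differentiable surrogate for the network's bit-size.

```python
# Illustrative differentiable entropy surrogate: soft assignments of weights to
# quantization centers, followed by the entropy of the average assignment distribution.
import torch


def soft_entropy(weights, centers, temperature=0.01):
    d2 = (weights.view(-1, 1) - centers.view(1, -1)) ** 2    # (N, K) squared distances
    assign = torch.softmax(-d2 / temperature, dim=1)          # soft assignment per weight
    p = assign.mean(dim=0)                                     # average usage of each center
    return -(p * torch.log2(p + 1e-12)).sum()                  # approx. bits per weight


weights = torch.randn(10_000) * 0.05
centers = torch.tensor([-0.1, 0.0, 0.1])    # assumed quantization grid
print(soft_entropy(weights, centers))       # can be added to the task loss as a penalty
```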
Compact and Computationally Efficient Representation of Deep Neural Networks
Wiedemann, Simon, Müller, Klaus-Robert, Samek, Wojciech
Dot product operations between matrices are at the heart of almost any field in science and technology. In many cases, they are the component that requires the highest computational resources during execution. For instance, deep neural networks such as VGG-16 require up to 15 giga-operations to perform the dot products present in a single forward pass, which results in significant energy consumption and thus limits their use in resource-limited environments, e.g., on embedded devices or smartphones. One common approach to reduce the complexity of the inference is to prune and quantize the weight matrices of the neural network and to efficiently represent them using sparse matrix data structures. However, since there is no guarantee that the weight matrices exhibit significant sparsity after quantization, the sparse format may be suboptimal. In this paper we present new efficient data structures for representing matrices with low entropy statistics and show that these formats are especially suitable for representing neural networks. Like sparse matrix data structures, these formats exploit the statistical properties of the data in order to reduce the size and execution complexity. Moreover, we show that the proposed data structures can not only be regarded as a generalization of sparse formats, but are also more energy and time efficient under practically relevant assumptions. Finally, we test the storage requirements and execution performance of the proposed formats on compressed neural networks and compare them to dense and sparse representations. We experimentally show that we are able to attain up to x15 compression ratios, x1.7 speed-ups and x20 energy savings when we losslessly convert state-of-the-art networks such as AlexNet, VGG-16, ResNet152 and DenseNet into the new data structures.
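To give a feel for how a format can exploit low entropy without requiring sparsity, here is a toy "shared-value" row representation: per row, each distinct non-zero value is stored once together with the columns where it occurs, so a matrix-vector product needs only one multiplication per distinct value and row. This illustrates the general idea only; it is not the specific data structures proposed in the paper.

```python
# Toy shared-value row format for matrices with few distinct values (illustration only).
import numpy as np


def encode_row(row):
    """Group the columns of a row by their (non-zero) value."""
    groups = {}
    for j, v in enumerate(row):
        if v != 0.0:
            groups.setdefault(v, []).append(j)
    return list(groups.items())              # [(value, [columns...]), ...]


def matvec(encoded_rows, x):
    """Matrix-vector product using one multiplication per distinct value and row."""
    y = np.zeros(len(encoded_rows))
    for i, groups in enumerate(encoded_rows):
        for value, cols in groups:
            y[i] += value * x[cols].sum()     # sum first, multiply once
    return y


W = np.random.choice([0.0, 0.5, -0.5], size=(4, 8), p=[0.6, 0.2, 0.2])
x = np.random.randn(8)
encoded = [encode_row(row) for row in W]
print(np.allclose(matvec(encoded, x), W @ x))   # True
```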
Sparse Binary Compression: Towards Distributed Deep Learning with minimal Communication
Sattler, Felix, Wiedemann, Simon, Müller, Klaus-Robert, Samek, Wojciech
Currently, progressively larger deep neural networks are trained on ever growing data corpora. As this trend is only going to increase in the future, distributed training schemes are becoming increasingly relevant. A major issue in distributed training is the limited communication bandwidth between contributing nodes or prohibitive communication cost in general. These challenges become even more pressing as the number of computation nodes increases. To counteract this development we propose sparse binary compression (SBC), a compression framework that allows for a drastic reduction of communication cost for distributed training. SBC combines existing techniques of communication delay and gradient sparsification with a novel binarization method and optimal weight update encoding to push compression gains to new limits. By doing so, our method also allows us to smoothly trade off gradient sparsity and temporal sparsity to adapt to the requirements of the learning task. Our experiments show that SBC can reduce the upstream communication on a variety of convolutional and recurrent neural network architectures by more than four orders of magnitude without significantly harming the convergence speed in terms of forward-backward passes. For instance, we can train ResNet50 on ImageNet to the baseline accuracy in the same number of iterations, using $\times 3531$ fewer bits, or train it to a $1\%$ lower accuracy using $\times 37208$ fewer bits. In the latter case, the total upstream communication required is cut from 125 terabytes to 3.35 gigabytes for every participating client.
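A minimal sketch of how the ingredients named in the abstract can fit together: residuals are accumulated locally (communication delay), only the largest-magnitude entries of the accumulated update are kept (gradient sparsification), and the kept entries are binarized to one shared positive and one shared negative magnitude. The sparsity value is an example, and the position encoding of the sparse message is omitted; this is an illustration rather than the paper's exact scheme.

```python
# Illustrative sparse-binary compression of updates with residual accumulation.
import torch


class SBCCompressor:
    def __init__(self, shape, sparsity=0.001):
        self.residual = torch.zeros(shape)   # locally accumulated, not-yet-sent update
        self.sparsity = sparsity

    def step(self, grad):
        acc = self.residual + grad                        # communication delay / accumulation
        k = max(1, int(self.sparsity * acc.numel()))
        _, idx = torch.topk(acc.abs().flatten(), k)       # positions of the k largest entries
        vals = acc.flatten()[idx]
        msg = torch.zeros(acc.numel())
        if (vals > 0).any():
            msg[idx[vals > 0]] = vals[vals > 0].mean()    # one shared positive magnitude
        if (vals < 0).any():
            msg[idx[vals < 0]] = vals[vals < 0].mean()    # one shared negative magnitude
        msg = msg.view_as(acc)
        self.residual = acc - msg                         # keep back what was not sent
        return msg


comp = SBCCompressor(shape=(10_000,), sparsity=0.001)
sent = comp.step(torch.randn(10_000))
print(int((sent != 0).sum()), "non-zero entries communicated")
```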