 binary activation



Reviews: Dichotomize and Generalize: PAC-Bayesian Binary Activated Deep Neural Networks

Neural Information Processing Systems

One contribution is a new approach for training neural networks with binary activations. The second contribution is a set of PAC-Bayesian generalization bounds for binary activated neural networks that, when used as the training objective, come very close to the test accuracy. The gap between training and test performance is also much smaller. I think this is very promising for training more robust networks. The method recovers variational Bayesian learning when the coefficient C is fixed, but in contrast to variational Bayes, this coefficient is learned in a principled way.


Reviews: Dichotomize and Generalize: PAC-Bayesian Binary Activated Deep Neural Networks

Neural Information Processing Systems

This work studies PAC-Bayes bound optimization in the setting of deep neural networks with binary activations. One of the stated contributions of the paper, showing how to optimize even though binary activations provide no naive derivative, is in fact a known technique in the literature on variational inference. This somewhat undermines the impact of the work, though importing these ideas into the PAC-Bayes community is nice. The other contribution is obtaining nonvacuous bounds, and here it is impressive to see such tight bounds. I have a few issues to raise with the introduction, which I would like addressed in revisions. First, the authors write: "Although informative, these results upper bound the prediction error of a (stochastic) neural network with perturbed weights, which is not the one used to predict in practice".
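To make the "no naive derivative" point concrete, here is a minimal sketch of one way a sign activation can be smoothed. The review gives no formula, so the setup is an assumption: the weights are drawn from an isotropic Gaussian centred at `w`, in which case the average of sign(v·x) has the closed form erf(w·x / (√2 ‖x‖)), which is differentiable in `w`. The function names and the example vectors are hypothetical, and the Monte Carlo estimate is there only as a sanity check of the closed form.

```python
import math
import random


def expected_sign(w, x):
    """Closed-form E[sign(v . x)] for v ~ N(w, I) (assumed weight distribution)."""
    dot = sum(wi * xi for wi, xi in zip(w, x))
    norm_x = math.sqrt(sum(xi * xi for xi in x))
    return math.erf(dot / (math.sqrt(2.0) * norm_x))


def monte_carlo_sign(w, x, num_samples, rng):
    """Estimate the same expectation by sampling Gaussian weights."""
    total = 0.0
    for _ in range(num_samples):
        v = [rng.gauss(wi, 1.0) for wi in w]
        pre_activation = sum(vi * xi for vi, xi in zip(v, x))
        total += 1.0 if pre_activation >= 0 else -1.0
    return total / num_samples


rng = random.Random(0)
w = [1.5, 0.5, -1.0]   # illustrative mean weights
x = [1.0, 2.0, -1.0]   # illustrative input
closed_form = expected_sign(w, x)
estimate = monte_carlo_sign(w, x, 20000, rng)
print(closed_form, estimate)
```

The sign function itself has zero gradient almost everywhere, while the averaged version varies smoothly with `w`, which is what makes gradient-based optimization possible under this assumption.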


Network Binarization via Contrastive Learning

arXiv.org Artificial Intelligence

Neural network binarization accelerates deep models by quantizing their weights and activations to 1 bit. However, there is still a large performance gap between Binary Neural Networks (BNNs) and their full-precision (FP) counterparts. Since earlier works have reduced the quantization error caused by weight binarization, activation binarization has become the major obstacle to further accuracy improvements. A BNN has a unique and interesting structure in which the binary and latent FP activations exist in the same forward pass (i.e., $\text{Binarize}(\mathbf{a}_F) = \mathbf{a}_B$). To mitigate the information degradation caused by binarizing FP activations, we establish a novel contrastive learning framework for training BNNs through the lens of Mutual Information (MI) maximization. MI is introduced as the metric that measures the information shared between binary and FP activations, which assists binarization with contrastive learning. Specifically, the representation ability of BNNs is greatly strengthened by pulling together positive pairs of binary and FP activations from the same input sample and pushing apart negative pairs from different samples (the number of negative pairs can be exponentially large). This benefits downstream tasks, including not only classification but also segmentation and depth estimation. The experimental results show that our method can be implemented as a pile-up module on existing state-of-the-art binarization methods and can remarkably improve their performance on CIFAR-10/100 and ImageNet, in addition to showing strong generalization ability on NYUD-v2.
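The pull/push idea in this abstract can be sketched with a generic InfoNCE-style loss. This is an illustration under stated assumptions, not the authors' objective: the cosine similarity, the temperature of 0.1, the random toy activations, and the function names are all hypothetical choices. Each FP activation is the anchor, its own binarized version is the positive, and the binarized activations of the other samples are the negatives.

```python
import math
import random


def binarize(vec):
    """Sign binarization to {-1, +1}."""
    return [1.0 if x >= 0 else -1.0 for x in vec]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(fp_acts, bin_acts, temperature=0.1):
    """Mean InfoNCE loss; the positive for fp_acts[i] is bin_acts[i]."""
    total = 0.0
    for i, anchor in enumerate(fp_acts):
        logits = [cosine(anchor, b) / temperature for b in bin_acts]
        # Numerically stable log-sum-exp over positive and negatives.
        max_logit = max(logits)
        log_denominator = max_logit + math.log(
            sum(math.exp(l - max_logit) for l in logits)
        )
        total += log_denominator - logits[i]
    return total / len(fp_acts)


random.seed(0)
fp_acts = [[random.gauss(0.0, 1.0) for _ in range(64)] for _ in range(8)]
bin_acts = [binarize(a) for a in fp_acts]

aligned_loss = info_nce(fp_acts, bin_acts)
# Pair each FP activation with another sample's binary code for comparison.
mismatched_loss = info_nce(fp_acts, bin_acts[1:] + bin_acts[:1])
print(aligned_loss, mismatched_loss)
```

A lower loss means the binary code of a sample is easier to pick out from the codes of other samples given its FP activation, which is the sense in which such a loss serves as a proxy for shared information.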


Probabilistic Binary Neural Networks

arXiv.org Machine Learning

Low bit-width weights and activations are an effective way of combating the increasing need for both memory and compute power of Deep Neural Networks. In this work, we present a probabilistic training method for neural networks with both binary weights and activations, called BLRNet. By embracing stochasticity during training, we circumvent the need to approximate the gradient of non-differentiable functions such as sign(·), while still obtaining a fully Binary Neural Network at test time. Moreover, the method allows for anytime ensemble predictions, with improved performance and uncertainty estimates, by sampling from the weight distribution. Since all operations in a layer of the BLRNet operate on random variables, we introduce stochastic versions of Batch Normalization and max pooling, which transfer well to a deterministic network at test time. We evaluate the BLRNet on multiple standardized benchmarks.
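The abstract does not spell out the mechanics, so the following is only a sketch of what treating binary weights as random variables can look like. It assumes each weight is +1 with probability p and -1 otherwise, independently; the names `probs`, `preactivation_moments`, and `deterministic_weights` are hypothetical. Under that assumption the pre-activation of a unit has a mean and variance available in closed form, and a deterministic binary network can be read off by taking the more likely value of each weight.

```python
import random


def sample_binary_weights(probs, rng):
    """Draw weights in {-1, +1}; probs[j] is P(w_j = +1)."""
    return [1.0 if rng.random() < p else -1.0 for p in probs]


def preactivation_moments(probs, inputs):
    """Exact mean and variance of sum_j w_j * x_j for independent binary w_j."""
    means = [2.0 * p - 1.0 for p in probs]
    mean = sum(m * x for m, x in zip(means, inputs))
    var = sum((1.0 - m * m) * x * x for m, x in zip(means, inputs))
    return mean, var


def deterministic_weights(probs):
    """Test-time weights: the more likely value of each binary weight."""
    return [1.0 if p >= 0.5 else -1.0 for p in probs]


rng = random.Random(0)
probs = [rng.random() for _ in range(16)]
inputs = [rng.gauss(0.0, 1.0) for _ in range(16)]

analytic_mean, analytic_var = preactivation_moments(probs, inputs)

# Compare against explicit sampling of binary weights.
samples = []
for _ in range(5000):
    w = sample_binary_weights(probs, rng)
    samples.append(sum(wi * xi for wi, xi in zip(w, inputs)))
empirical_mean = sum(samples) / len(samples)

test_time_weights = deterministic_weights(probs)
print(analytic_mean, empirical_mean)
```

Because the moments are smooth functions of the probabilities, gradients can flow to `probs` without differentiating through sign(·); repeated sampling of the weights gives the kind of ensemble prediction the abstract mentions.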


Towards Accurate Binary Convolutional Neural Network

Neural Information Processing Systems

We introduce a novel scheme to train binary convolutional neural networks (CNNs), that is, CNNs with weights and activations constrained to {-1,+1} at run-time. It is known that using binary weights and activations drastically reduces memory size and accesses, and can replace arithmetic operations with more efficient bitwise operations, leading to much faster test-time inference and lower power consumption. However, previous works on binarizing CNNs usually result in severe degradation of prediction accuracy. In this paper, we address this issue with two major innovations: (1) approximating full-precision weights with a linear combination of multiple binary weight bases; (2) employing multiple binary activations to alleviate information loss. The resulting binary CNN, denoted ABC-Net, is shown to achieve performance much closer to that of its full-precision counterpart, and even reaches comparable prediction accuracy on ImageNet and forest trail datasets, given adequate binary weight bases and activations.
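Innovation (1) can be illustrated with a small sketch that writes a weight vector as a weighted sum of {-1,+1} bases. Note the swap: the abstract does not say how the bases and coefficients are chosen, so this uses a simple greedy residual scheme (binarize the residual, scale by its mean absolute value, repeat) rather than the authors' procedure. All function names and the random toy weights are hypothetical.

```python
import random


def sign(x):
    return 1.0 if x >= 0 else -1.0


def greedy_binary_bases(weights, num_bases):
    """Greedily fit weights ~ sum_i alpha_i * base_i with base_i in {-1, +1}^n."""
    residual = list(weights)
    alphas, bases = [], []
    for _ in range(num_bases):
        base = [sign(r) for r in residual]
        # For a fixed sign base, the least-squares scale is mean(|residual|).
        alpha = sum(abs(r) for r in residual) / len(residual)
        alphas.append(alpha)
        bases.append(base)
        residual = [r - alpha * b for r, b in zip(residual, base)]
    return alphas, bases


def reconstruct(alphas, bases):
    n = len(bases[0])
    return [sum(a * b[i] for a, b in zip(alphas, bases)) for i in range(n)]


def squared_error(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(256)]

errors = []
for num_bases in (1, 2, 3):
    alphas, bases = greedy_binary_bases(weights, num_bases)
    errors.append(squared_error(weights, reconstruct(alphas, bases)))
print(errors)
```

In this greedy scheme each extra base reduces the squared residual by n·alpha², so the approximation error shrinks as bases are added, which matches the abstract's point that accuracy approaches the full-precision model given adequate binary weight bases.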


Towards Accurate Binary Convolutional Neural Network

arXiv.org Machine Learning

We introduce a novel scheme to train binary convolutional neural networks (CNNs), that is, CNNs with weights and activations constrained to {-1,+1} at run-time. It is known that using binary weights and activations drastically reduces memory size and accesses, and can replace arithmetic operations with more efficient bitwise operations, leading to much faster test-time inference and lower power consumption. However, previous works on binarizing CNNs usually result in severe degradation of prediction accuracy. In this paper, we address this issue with two major innovations: (1) approximating full-precision weights with a linear combination of multiple binary weight bases; (2) employing multiple binary activations to alleviate information loss. The resulting binary CNN, denoted ABC-Net, is shown to achieve performance much closer to that of its full-precision counterpart, and even reaches comparable prediction accuracy on ImageNet and forest trail datasets, given adequate binary weight bases and activations.