The Synthesis of XNOR Recurrent Neural Networks with Stochastic Logic
XNOR networks have emerged as a way to reduce the model size and computational cost of neural networks so that they can be deployed on specialized hardware that must run in real time with limited resources. In XNOR networks, both weights and activations are binary, which greatly benefits specialized hardware by replacing expensive multiplications with simple XNOR operations. Although XNOR convolutional and fully-connected neural networks have been successfully developed during the past few years, there is no XNOR network implementing commonly used variants of recurrent neural networks such as long short-term memories (LSTMs). The main computational core of LSTMs involves vector-matrix multiplications followed by a set of non-linear functions and element-wise multiplications to obtain the gate activations and state vectors, respectively. Previous attempts to quantize LSTMs focused only on the vector-matrix multiplications while retaining the element-wise multiplications in full precision. In this paper, we propose a method that converts all the multiplications in LSTMs to XNOR operations using stochastic computing. To this end, we introduce a weighted finite-state machine and its synthesis method to approximate the non-linear functions used in LSTMs on stochastic bit streams. Experimental results show that the proposed XNOR LSTMs reduce the computational complexity of their quantized counterparts by a factor of 86 without sacrificing latency, while achieving better accuracy across various temporal tasks.
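To make the "multiplication becomes XNOR" idea concrete, here is a minimal Python sketch of two standard facts the abstract relies on. It is not the authors' implementation. The function names (`xnor_dot`, `to_stream`, `from_stream`, `sc_multiply`), the stream length, and the seed are all assumptions chosen for illustration. The first function shows that a dot product of {-1, +1} vectors stored as bits reduces to counting XNOR matches. The rest shows that, in bipolar stochastic computing, where a value x in [-1, 1] is encoded as a bit stream with P(bit = 1) = (x + 1) / 2, the XNOR of two independent streams encodes the product x * y.

```python
import random


def xnor_dot(a_bits, w_bits):
    """Dot product of two {-1, +1} vectors stored as bits (0 -> -1, 1 -> +1)."""
    n = len(a_bits)
    # XNOR of two bits is 1 exactly when the bits agree.
    matches = sum(1 for a, w in zip(a_bits, w_bits) if a == w)
    # Each match contributes +1 and each mismatch -1 to the dot product.
    return 2 * matches - n


def to_stream(x, length, rng):
    """Bipolar stochastic encoding: P(bit = 1) = (x + 1) / 2 for x in [-1, 1]."""
    p = (x + 1) / 2
    return [1 if rng.random() < p else 0 for _ in range(length)]


def from_stream(bits):
    """Decode a bipolar stream back to an estimate in [-1, 1]."""
    return 2 * sum(bits) / len(bits) - 1


def sc_multiply(x, y, length=20000, seed=0):
    """Estimate x * y by XNOR-ing two independent bipolar streams.

    `length` and `seed` are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    sx = to_stream(x, length, rng)
    sy = to_stream(y, length, rng)
    xnor = [1 - (a ^ b) for a, b in zip(sx, sy)]
    return from_stream(xnor)


if __name__ == "__main__":
    print(xnor_dot([1, 0, 1, 1], [1, 1, 0, 1]))
    print(sc_multiply(0.5, -0.4))  # a noisy estimate of -0.2
```

The stochastic estimate is only approximate: its error shrinks as the stream gets longer, which is the usual precision-versus-stream-length trade-off in stochastic computing.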
Reviews: The Synthesis of XNOR Recurrent Neural Networks with Stochastic Logic
This paper proposes to use stochastic logic to approximate the activation functions of LSTMs. Binarization of non-linear units in deep neural networks is an interesting topic that can be relevant for low-resource computing. The main contribution of the paper is the application of stochastic logic to approximate activation functions (e.g. The authors applied the technique to a variant of the LSTM model on PTB. Given that the technique is not really tied to the LSTM model, it would be more interesting to evaluate more model architectures (e.g.
Reviews: The Synthesis of XNOR Recurrent Neural Networks with Stochastic Logic
This paper introduces interesting stochastic finite-state-machine-based methods to approximate non-linear activation functions, including the hyperbolic tangent and sigmoid functions. A fully binary LSTM model (both weights and hidden states are binary) is constructed in which XNOR operations perform all the multiplications in the gate and state computations. Empirical results show that the proposed binary LSTM model can dramatically reduce the computational cost without sacrificing latency or accuracy compared with existing methods. In the rebuttal, the reviewers' concerns are carefully addressed, e.g., by adding an FPGA-based implementation. However, some of the responses still lack sufficient detail and discussion, in particular regarding the cost of stochastic computing and the cost of memory movement.
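For readers unfamiliar with how a finite-state machine can approximate tanh on a stochastic bit stream, here is a minimal Python sketch of the classical saturating-counter FSM from the stochastic computing literature. It is not the weighted FSM or the synthesis method proposed in the paper, whose details are not given here. The function name `fsm_tanh`, the number of states, the stream length, and the seed are assumptions for illustration. The FSM moves one state up on an input 1 and one state down on an input 0, saturating at both ends, and outputs 1 whenever it is in the upper half of its states. For an input stream encoding x in bipolar format, the decoded output is roughly tanh(n_states / 2 * x) for small |x|.

```python
import math
import random


def fsm_tanh(x, n_states=8, length=50000, seed=0):
    """Approximate tanh(n_states / 2 * x) with a saturating up/down counter.

    All parameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    p = (x + 1) / 2            # bipolar encoding of the input value
    state = n_states // 2      # start in the middle of the state chain
    ones = 0
    for _ in range(length):
        bit = 1 if rng.random() < p else 0
        # Move up on a 1 and down on a 0, saturating at the end states.
        if bit:
            state = min(state + 1, n_states - 1)
        else:
            state = max(state - 1, 0)
        # Output bit is 1 while the FSM sits in the upper half of its states.
        if state >= n_states // 2:
            ones += 1
    return 2 * ones / length - 1   # decode the bipolar output stream


if __name__ == "__main__":
    for x in (-0.3, 0.1, 0.3):
        print(x, fsm_tanh(x), math.tanh(4 * x))
```

The output bits of such an FSM are correlated over time, so longer streams are needed for a given accuracy than in the plain XNOR multiplication case.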
The Synthesis of XNOR Recurrent Neural Networks with Stochastic Logic
Ardakani, Arash, Ji, Zhengyun, Ardakani, Amir, Gross, Warren