A Reconfigurable Winograd CNN Accelerator with Nesting Decomposition Algorithm for Computing Convolution with Large Filters
Jingbo Jiang, Xizi Chen, Chi-Ying Tsui
–arXiv.org Artificial Intelligence
Abstract--Recent literature found that convolutional neural networks (CNN) with large filters perform well in some applications such as image semantic segmentation. The Winograd transformation helps to reduce the number of multiplications in a convolution but suffers from numerical instability when the convolution filter size gets large. This work proposes a nested Winograd algorithm. Compared with the state-of-the-art OLA-Winograd algorithm, the proposed algorithm reduces the multiplications by 1.41 to 3.29 times for computing 5×5 to 9×9 convolutions.

…filters into a fractional number field, which is done by multiplying the feature maps and filters with some fixed matrices. These matrices are derived from a Vandermonde matrix, whose entry values grow exponentially with the matrix size. Thus, multiplying the data with a large number may make the computation overflow, and dividing the data with a large number makes the computation suffer from quantization error. Compared with FFT, the Winograd algorithm appears to be more popular in recent CNN accelerators since it normally…
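To make the "fixed transform matrices" concrete, the following is a minimal sketch of the standard 1D Winograd F(2,3) transform (in the style of Lavin and Gray), not the paper's nested algorithm: two outputs of a 3-tap filter are computed with 4 multiplications instead of 6 by transforming the input tile and the filter with fixed matrices and multiplying element-wise. The matrices BT, G, and AT below are the conventional F(2,3) choices; for larger filter sizes, matrices of this kind have rapidly growing entries, which is the numerical-instability issue the abstract refers to.

```python
# Minimal 1D Winograd F(2,3) sketch: y = A^T [(G g) ⊙ (B^T d)].
# Computes two outputs of a 3-tap "valid" correlation with 4 multiplications.
import numpy as np

# Fixed transform matrices for F(2,3) (standard choices, not from the paper).
BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=float)   # input transform
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])                # filter transform
AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=float)    # output transform

def winograd_f23(d, g):
    """Two outputs of the 'valid' correlation of a length-4 input tile d
    with a length-3 filter g, via an element-wise product in transform space."""
    return AT @ ((G @ g) * (BT @ d))

d = np.array([1.0, 2.0, 3.0, 4.0])
g = np.array([1.0, 0.0, -1.0])
y = winograd_f23(d, g)
# Matches the direct sliding-window result np.correlate(d, g, 'valid').
```

A 2D F(2×2, 3×3) tile works the same way with the transforms applied on both sides; the nested algorithm in the paper builds larger convolutions out of small transforms like these to avoid the ill-conditioned matrices that a direct large-filter Winograd transform would require.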
Feb-25-2021