A Massively Parallel Digital Learning Processor
Graf, Hans P., Cadambi, Srihari, Jakkula, Venkata, Sankaradass, Murugan, Cosatto, Eric, Chakradhar, Srimat, Dourdanovic, Igor
–Neural Information Processing Systems
We present a new, massively parallel architecture for accelerating machine learning algorithms, based on arrays of variable-resolution arithmetic vector processing elements (VPE). Groups of VPEs operate in SIMD (single instruction multiple data) mode, and each group is connected to an independent memory bank. In this way memory bandwidth scales with the number of VPE, and the main data flows are local, keeping power dissipation low. With 256 VPEs, implemented on two FPGA (field programmable gate array) chips, we obtain a sustained speed of 19 GMACS (billion multiply-accumulate per sec.) for SVM training, and 86 GMACS for SVM classification. This performance is more than an order of magnitude higher than that of any FPGA implementation reported so far.
Neural Information Processing Systems
Feb-15-2020, 01:56:09 GMT