How to Accelerate Neural Networks By Exploiting Sparsity

In one sense, Deep Neural Networks (DNNs) are counterintuitive. Despite representing a complex function with thousands or even millions of parameters, a DNN's network structure ensures that this function is a composition of many similar, simpler functions. Notice, however, that the "atomic operation" in this computation is neither the dot product nor the matrix product but what is known in the language of digital signal processing as the multiply-add operation, a ← a + (b × c), which multiplies two values b and c and adds the product to an accumulator a. Most processors built to do any kind of digital signal processing include dedicated multiply-accumulate (MAC) units which implement the fused multiply-add (FMA) operation, where "fused" simply means that the operation uses only one rounding step. When a DNN model includes thousands or millions of weights, a processor must perform a comparable number of FMA operations to evaluate the model on a single data point. Note, however, that the FMA operation is moot if either b or c is zero: their product is zero, so a remains unchanged.
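To make the zero-skipping observation concrete, here is a minimal Python sketch of a single dot product written as a chain of multiply-accumulate steps a ← a + (b × c), where b is a weight and c is an input. The function names `dense_dot` and `sparse_dot` and the sample vectors are illustrative assumptions, not taken from the article, and plain Python arithmetic only models the operation count; it does not reproduce the single rounding step of a hardware FMA.

```python
def dense_dot(weights, inputs):
    """Run one multiply-accumulate per weight/input pair.

    Returns the accumulated value and the number of MAC steps performed.
    """
    acc = 0.0
    macs = 0
    for b, c in zip(weights, inputs):
        # a <- a + b * c (Python rounds the multiply and the add separately,
        # so this is an unfused multiply-add used purely for illustration)
        acc += b * c
        macs += 1
    return acc, macs


def sparse_dot(weights, inputs):
    """Same accumulation, but skip any step where b or c is zero.

    A zero operand makes the product zero, so the accumulator would not
    change and the multiply-accumulate can be omitted.
    """
    acc = 0.0
    macs = 0
    for b, c in zip(weights, inputs):
        if b == 0 or c == 0:
            continue  # the step is moot: acc would stay the same
        acc += b * c
        macs += 1
    return acc, macs


# Hypothetical example data: four of the six pairs contain a zero.
weights = [0.5, 0.0, -2.0, 0.0, 0.0, 1.0]
inputs = [4.0, 3.0, 0.0, 7.0, 1.0, 2.0]

dense_result, dense_macs = dense_dot(weights, inputs)
sparse_result, sparse_macs = sparse_dot(weights, inputs)
print(dense_result, dense_macs, sparse_result, sparse_macs)
```

Both functions return the same value for finite inputs, but the second performs a multiply-accumulate only for pairs in which both operands are nonzero. In this toy loop the zero test costs about as much as the work it avoids, so the sketch only illustrates the counting argument; a practical speedup would depend on hardware or data formats that can skip zeros cheaply.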
