PackQViT: Faster Sub-8-bit Vision Transformers via Full and Packed Quantization on the Mobile

Peiyan Dong, Lei Lu

Neural Information Processing Systems 

Model quantization is a widely used technique for improving the hardware efficiency of deep neural networks.
