FixyNN: Efficient Hardware for Mobile Computer Vision via Transfer Learning

Whatmough, Paul N., Zhou, Chuteng, Hansen, Patrick, Venkataramanaiah, Shreyas Kolala, Seo, Jae-sun, Mattina, Matthew

arXiv.org Machine Learning 

The computational demands of computer vision tasks based on state-of-the-art Convolutional Neural Network (CNN) image classification far exceed the energy budgets of mobile devices. This paper proposes FixyNN, which consists of a fixed-weight feature extractor that generates ubiquitous CNN features, and a conventional programmable CNN accelerator which processes a dataset-specific CNN. Image classification models for FixyNN are trained end-to-end via transfer learning, with the common feature extractor representing the transferred part and the programmable part being learned on the target dataset. Experimental results demonstrate that FixyNN hardware can achieve very high energy efficiency, up to 26.6 TOPS/W (4.81× better than an iso-area programmable accelerator). Over a suite of six datasets, we trained models via transfer learning with an accuracy loss of less than 1%, resulting in up to 11.2 TOPS/W, nearly 2× more efficient than a conventional programmable CNN accelerator of the same area.

Mobile devices exhibit constraints in the energy and silicon area that can be allocated to CV tasks, which limits the adoption of CNNs at high resolution and frame-rate. Two trends address this: more efficient CNN architectures (e.g., MobileNetV1 achieves accuracy similar to VGG, 89.9% vs. 92.7% top-5 on ImageNet), and the emergence of specialized hardware accelerators tailored specifically to CNN workloads. The paper highlights the performance and power efficiency advantage of buffering data in fixed-weight layers, and a tool flow for automatically generated hardware.

Figure 1: FixyNN proposes to split a deep CNN into two parts, which are implemented in hardware using a (shared) fixed-weight feature extractor (FFE) hardware accelerator and a dataset-specific programmable accelerator.
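The split in Figure 1 maps onto the standard transfer-learning recipe: freeze a shared feature extractor and train only the dataset-specific layers on the target dataset. The sketch below is a minimal PyTorch illustration of that training setup, not the authors' tool flow; the layer shapes, class count, and variable names are illustrative assumptions.

```python
# Minimal sketch of FixyNN-style transfer learning (assumed PyTorch setup;
# layer shapes and names are illustrative, not taken from the paper).
import torch
import torch.nn as nn

# Shared front-end: in FixyNN these weights would be baked into the
# fixed-weight feature extractor (FFE) hardware.
fixed_feature_extractor = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1), nn.ReLU(),
)
for p in fixed_feature_extractor.parameters():
    p.requires_grad = False  # frozen: the "transferred" part

# Dataset-specific back-end: runs on the programmable CNN accelerator
# and is the only part learned on the target dataset.
programmable_head = nn.Sequential(
    nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(128, 10),  # 10 target classes, chosen arbitrarily here
)

model = nn.Sequential(fixed_feature_extractor, programmable_head)
# Only the head's parameters are handed to the optimizer.
optimizer = torch.optim.SGD(programmable_head.parameters(), lr=0.01)

# One illustrative training step on random data.
x, y = torch.randn(8, 3, 224, 224), torch.randint(0, 10, (8,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
optimizer.step()
```

Freezing the early layers is what allows one hardwired feature extractor to be shared across many downstream tasks, which is the premise behind fixing its weights in silicon.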
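As a rough consistency check on the efficiency claims, assuming both comparisons in the abstract use the same iso-area programmable baseline, the implied baseline efficiency can be back-computed from the reported figures:

```python
# Back-of-envelope check of the abstract's numbers (assumes one common
# iso-area programmable baseline for both comparisons).
peak_ffe_tops_w = 26.6   # peak FixyNN efficiency (TOPS/W)
peak_speedup = 4.81      # reported advantage over the iso-area baseline
baseline = peak_ffe_tops_w / peak_speedup
print(f"implied baseline: {baseline:.2f} TOPS/W")  # ~5.53 TOPS/W

six_dataset_tops_w = 11.2  # efficiency at <1% accuracy loss
print(f"implied gain: {six_dataset_tops_w / baseline:.2f}x")  # ~2.03x
```

Under that assumption the two headline figures are mutually consistent: 26.6 / 4.81 ≈ 5.5 TOPS/W for the baseline, and 11.2 / 5.5 ≈ 2.0×, matching the "nearly 2×" claim.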
