Intermediate Deep Feature Compression: the Next Battlefield of Intelligent Sensing

Zhuo Chen, Weisi Lin, Shiqi Wang, Lingyu Duan, Alex C. Kot

arXiv.org Artificial Intelligence 

Abstract--Recent advances in hardware technology have made intelligent analysis with deep learning at the front end increasingly prevalent and practical. To better enable intelligent sensing at the front end, instead of compressing and transmitting visual signals or the ultimately utilized top-layer deep learning features, we propose to compactly represent and convey the intermediate-layer deep learning features, which offer high generalization capability, to facilitate a collaborative approach between the front end and the cloud end. This strategy strikes a good balance among computational load, transmission load, and generalization ability when deploying deep neural networks for large-scale cloud-based visual analysis. Moreover, the presented strategy makes the standardization of deep feature coding more feasible and promising, as a series of tasks can simultaneously benefit from the transmitted intermediate layers. We also present evaluation results for lossless deep feature compression with four benchmark data compression methods, which provide meaningful investigations and baselines for future research and standardization activities.

Recently, deep neural networks (DNNs) have demonstrated state-of-the-art performance in various computer vision tasks, e.g., image classification [1], [2], [3], [4], image object detection [5], [6], visual tracking [7], and visual retrieval [8]. In contrast to handcrafted features such as the Scale-Invariant Feature Transform (SIFT) [9], deep learning based approaches are able to learn representative features directly from vast amounts of data. For image classification, a fundamental task of computer vision, the AlexNet model [1] achieved 9% higher classification accuracy than the previous handcrafted methods in the 2012 ImageNet competition [10], which provides a large-scale training dataset with 1.2 million images and one thousand categories. Inspired by the remarkable progress of AlexNet, DNN models have continued to be the undisputed leaders in the ImageNet competition. In particular, both VGGNet [2] and GoogLeNet [11] reported promising performance in the ILSVRC 2014 classification challenge, demonstrating that deeper and wider architectures can bring great benefits in learning better representations from large-scale datasets. In 2016, He et al. proposed residual blocks to enable very deep network structures [3]. With the advances of network infrastructure, cloud-based applications have been springing up in recent years. In particular, front-end devices acquire information from users or the physical world, which is subsequently transmitted to the cloud end (i.e., the data center) for further processing and analysis.
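To make the collaborative pipeline concrete, the following minimal sketch splits a pre-trained network between a front end and a cloud end, losslessly compresses the intermediate feature map, and resumes inference after decompression. This is an illustration, not the paper's exact setup: the model (VGG-16 from torchvision), the split point, and the input path are assumptions, and the general-purpose codecs zlib, bz2, and lzma merely stand in for the four benchmark compression methods evaluated in the paper.

```python
# A sketch of front-end/cloud collaborative inference with lossless
# intermediate-feature compression. Model, split point, and input path
# are illustrative assumptions.
import bz2
import lzma
import zlib

import numpy as np
import torch
from PIL import Image
from torchvision import models, transforms

model = models.vgg16(pretrained=True).eval()

# Front end: run only up to an intermediate convolutional block, as if
# the device stopped inference early and shipped the feature map out.
split = 17  # end of the third conv block in VGG-16's feature extractor
front_end = torch.nn.Sequential(*list(model.features.children())[:split])

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
image = preprocess(Image.open("example.jpg")).unsqueeze(0)  # hypothetical input

with torch.no_grad():
    feature = front_end(image)

# Lossless compression of the raw intermediate feature tensor: compare
# payload sizes across general-purpose codecs.
raw = feature.numpy().tobytes()
for name, codec in (("zlib", zlib), ("bz2", bz2), ("lzma", lzma)):
    payload = codec.compress(raw)
    print(f"{name}: {len(raw)} -> {len(payload)} bytes "
          f"(ratio {len(raw) / len(payload):.2f})")

# Cloud end: decompress and resume inference. Because the codec is
# lossless, the reconstruction is bit-exact and the final prediction
# matches end-to-end inference on the original image.
restored = np.frombuffer(zlib.decompress(zlib.compress(raw)),
                         dtype=np.float32).reshape(tuple(feature.shape))
cloud = torch.nn.Sequential(*list(model.features.children())[split:])
with torch.no_grad():
    deep = cloud(torch.from_numpy(restored.copy()))
    logits = model.classifier(model.avgpool(deep).flatten(1))
print("top-1 class index:", int(logits.argmax()))
```

Because the transmitted tensor is an intermediate layer rather than a task-specific top-layer feature, the same cloud-side payload could in principle feed several analysis heads (classification, detection, retrieval), which is the generalization argument the abstract makes.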
