CMUA-Watermark: A Cross-Model Universal Adversarial Watermark for Combating Deepfakes
Huang, Hao, Wang, Yongtao, Chen, Zhaoyu, Li, Yuheng, Tang, Zhi, Chu, Wei, Chen, Jingdong, Lin, Weisi, Ma, Kai-Kuang
Malicious applications of deepfakes (i.e., technologies that can generate target faces or face attributes) have posed a huge threat to our society. The fake multimedia content generated by deepfake models can harm the reputation, and even threaten the property, of the person being impersonated. Fortunately, adversarial watermarks can be used to combat deepfake models, leading them to generate distorted images. Existing methods require an individual training process for every facial image to generate an adversarial watermark against a specific deepfake model, which is extremely inefficient. To address this problem, we propose a universal adversarial attack method on deepfake models, to generate a Cross-Model Universal Adversarial Watermark (CMUA-Watermark) that can protect thousands of facial images from multiple deepfake models. Specifically, we first propose a cross-model universal attack pipeline that attacks multiple deepfake models iteratively and combines their gradients. Then, we introduce a batch-based method to alleviate the conflict among adversarial watermarks generated from different facial images. Finally, we design a more reasonable and comprehensive evaluation method for assessing the effectiveness of the adversarial watermark. Experimental results demonstrate that the proposed CMUA-Watermark can effectively distort the fake facial images generated by deepfake models and successfully protect facial images from deepfakes in real scenes.
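The cross-model universal attack pipeline described above can be sketched as a PGD-style loop that accumulates signed gradients across a batch of images and several models, producing a single watermark shared by all of them. This is a minimal toy sketch, not the paper's implementation: the `model_grads` functions stand in for hypothetical deepfake models' distortion-loss gradients, and all hyperparameter names (`steps`, `eps`, `alpha`) are illustrative assumptions.

```python
import numpy as np

def universal_watermark(images, model_grads, steps=10, eps=0.1, alpha=0.01):
    """Toy sketch of a cross-model universal adversarial watermark.

    images      -- list of arrays (facial images), all the same shape
    model_grads -- list of functions; each returns the gradient of a
                   (hypothetical) deepfake model's distortion loss
                   w.r.t. its input
    The watermark `w` is one perturbation shared across all images and
    models, updated with combined signed gradients and clipped to an
    L-infinity ball of radius `eps`.
    """
    w = np.zeros_like(images[0])
    for _ in range(steps):
        g = np.zeros_like(w)
        for img in images:                 # batch of facial images
            for grad_fn in model_grads:    # combine gradients across models
                g += np.sign(grad_fn(img + w))
        # one signed ascent step, then project back onto the eps-ball
        w = np.clip(w + alpha * np.sign(g), -eps, eps)
    return w
```

With analytic toy gradients (e.g., `lambda x: x`), the returned watermark has the image shape and never exceeds `eps` in magnitude, mirroring the imperceptibility constraint the abstract implies.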
Intermediate Deep Feature Compression: the Next Battlefield of Intelligent Sensing
Chen, Zhuo, Lin, Weisi, Wang, Shiqi, Duan, Lingyu, Kot, Alex C.
Abstract--The recent advances of hardware technology have made intelligent analysis equipped at the front end with deep learning more prevailing and practical. To better enable intelligent sensing at the front end, instead of compressing and transmitting visual signals or the ultimately utilized top-layer deep learning features, we propose to compactly represent and convey the intermediate-layer deep learning features, which have high generalization capability, to facilitate the collaborative approach between the front and cloud ends. This strategy enables a good balance among the computational load, the transmission load, and the generalization ability for cloud servers when deploying deep neural networks for large-scale cloud-based visual analysis. Moreover, the presented strategy also makes the standardization of deep feature coding more feasible and promising, as a series of tasks can simultaneously benefit from the transmitted intermediate layers. We also present results for the evaluation of lossless deep feature compression with four benchmark data compression methods, which provides meaningful investigations and baselines for future research and standardization activities.

Recently, deep neural networks (DNNs) have demonstrated state-of-the-art performance in various computer vision tasks, e.g., image classification [1], [2], [3], [4], image object detection [5], [6], visual tracking [7], and visual retrieval [8]. In contrast to handcrafted features such as the Scale-Invariant Feature Transform (SIFT) [9], deep learning based approaches are able to learn representative features directly from vast amounts of data. For image classification, a fundamental task of computer vision, the AlexNet model [1] achieved 9% better classification accuracy than the previous handcrafted methods in the 2012 ImageNet competition [10], which provides a large-scale training dataset with 1.2 million images and one thousand categories.
Inspired by the remarkable progress of AlexNet, DNN models have continued to be the undisputed leaders in the ImageNet competition. In particular, both VGGNet [2] and GoogLeNet [11] reported promising performance in the ILSVRC 2014 classification challenge, demonstrating that deeper and wider architectures can bring great benefits in learning better representations from large-scale datasets. In 2016, He et al. proposed residual blocks to enable very deep learning structures [3]. With the advances of network infrastructure, cloud-based applications have been springing up in recent years. In particular, front-end devices acquire information from users or the physical world, which is subsequently transmitted to the cloud end (i.e., the data center) for further processing and analysis.
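The lossless deep feature compression evaluation mentioned in the abstract can be sketched with general-purpose codecs from the Python standard library. This is a hedged stand-in, not the paper's benchmark suite: the specific codecs (`zlib`, `bz2`, `lzma`) and the `lossless_ratios` helper are illustrative assumptions, chosen only to show how a raw intermediate feature map could be serialized and its compression ratio measured.

```python
import bz2
import lzma
import zlib

import numpy as np

def lossless_ratios(feature_map):
    """Compare stdlib lossless codecs on a raw intermediate feature map.

    The feature tensor is serialized to float32 bytes and passed through
    three general-purpose compressors; each ratio is
    original_size / compressed_size, so values above 1.0 indicate savings.
    """
    raw = np.asarray(feature_map, dtype=np.float32).tobytes()
    codecs = {
        "zlib": zlib.compress,
        "bz2": bz2.compress,
        "lzma": lzma.compress,
    }
    return {name: len(raw) / len(fn(raw)) for name, fn in codecs.items()}
```

Intermediate feature maps after ReLU activations are typically sparse (many exact zeros), which is precisely why even generic lossless codecs achieve ratios well above 1.0 on them; dense, high-entropy tensors would compress far less.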