Adv-Attribute: Inconspicuous and Transferable Adversarial Attack on Face Recognition (Supplementary Material)

Neural Information Processing Systems

StyleGAN [1] and the proposed Adv-Attribute attack. During training, the proposed importance-aware attribute selection chooses the optimal attribute for each pair of target and source faces; when attacking the same target face, different source faces select different attributes at each step.
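The fragment above only names the importance-aware selection step. As one plausible reading, the sketch below scores each candidate attribute edit by how much it increases the face-recognition similarity to the target identity and keeps the best-scoring attribute for this source-target pair; all names (`generator`, `face_net`, `attr_dirs`) are hypothetical stand-ins, not the paper's implementation.

```python
import torch

def select_attribute(src_latent, target_feat, attr_dirs, generator, face_net, step=0.5):
    """Score each candidate attribute edit and return the best one.

    src_latent : (1, D) latent code of the source face
    target_feat: (1, F) face-recognition embedding of the target face
    attr_dirs  : dict mapping attribute name -> (1, D) latent edit direction
    generator  : latent -> image network (hypothetical stand-in)
    face_net   : image -> embedding network (hypothetical stand-in)
    """
    best_name, best_score = None, -float("inf")
    for name, direction in attr_dirs.items():
        edited_img = generator(src_latent + step * direction)  # apply one attribute edit
        feat = face_net(edited_img)                            # embed the edited face
        # Higher cosine similarity to the target identity = stronger impersonation.
        score = torch.cosine_similarity(feat, target_feat).item()
        if score > best_score:
            best_name, best_score = name, score
    return best_name
```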



Face Deepfakes - A Comprehensive Review

Fernando, Tharindu, Priyasad, Darshana, Sridharan, Sridha, Ross, Arun, Fookes, Clinton

arXiv.org Artificial Intelligence

In recent years, remarkable advancements in deepfake generation technology have led to unprecedented leaps in its realism and capabilities. Despite these advances, we observe a notable lack of structured and deep analysis of deepfake technology. The principal aim of this survey is to contribute a thorough theoretical analysis of state-of-the-art face deepfake generation and detection methods. Furthermore, we provide a coherent and systematic evaluation of the implications of deepfakes for face biometric recognition approaches. In addition, we outline key applications of face deepfake technology, elucidating both its positive and negative uses, provide a detailed discussion of the gaps in existing research, and propose key research directions for further investigation.


VividFace: A Diffusion-Based Hybrid Framework for High-Fidelity Video Face Swapping

Shao, Hao, Wang, Shulun, Zhou, Yang, Song, Guanglu, He, Dailan, Qin, Shuo, Zong, Zhuofan, Ma, Bingqi, Liu, Yu, Li, Hongsheng

arXiv.org Artificial Intelligence

Video face swapping is becoming increasingly popular across various applications, yet existing methods primarily focus on static images and struggle with video face swapping because of challenges in temporal consistency and complex scenarios. In this paper, we present the first diffusion-based framework specifically designed for video face swapping. Our approach introduces a novel image-video hybrid training framework that leverages both abundant static image data and temporal video sequences, addressing the inherent limitations of video-only training. The framework incorporates a specially designed diffusion model coupled with a VidFaceVAE that effectively processes both types of data to better maintain the temporal coherence of the generated videos. To further disentangle identity and pose features, we construct the Attribute-Identity Disentanglement Triplet (AIDT) Dataset, in which each triplet contains three face images: two sharing the same pose and two sharing the same identity. Enhanced with comprehensive occlusion augmentation, this dataset also improves robustness against occlusions. Additionally, we integrate 3D reconstruction techniques as input conditioning to our network to handle large pose variations. Extensive experiments demonstrate that our framework achieves superior performance in identity preservation, temporal consistency, and visual quality compared to existing methods, while requiring fewer inference steps. Our approach effectively mitigates key challenges in video face swapping, including temporal flickering, identity drift, and sensitivity to occlusions and pose variations.
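The triplet constraint in the AIDT Dataset is concrete enough to sketch in code. Below is a minimal illustration of one plausible pairing convention (the class and field names are hypothetical, not the paper's API): an anchor image, a second image sharing its identity, and a third sharing its pose.

```python
from dataclasses import dataclass

@dataclass
class FaceImage:
    path: str
    identity: str   # subject label
    pose: str       # discretized pose bin, e.g. "yaw_+30"

@dataclass
class AIDTTriplet:
    """Anchor shares identity with `same_id` and pose with `same_pose`,
    so each factor is held fixed in exactly one pair."""
    anchor: FaceImage
    same_id: FaceImage     # same identity as anchor, different pose
    same_pose: FaceImage   # same pose as anchor, different identity

    def is_valid(self) -> bool:
        return (
            self.anchor.identity == self.same_id.identity
            and self.anchor.pose != self.same_id.pose
            and self.anchor.pose == self.same_pose.pose
            and self.anchor.identity != self.same_pose.identity
        )
```

Holding each factor fixed in exactly one pair is what lets a network contrast identity and pose directly, which is the disentanglement the abstract describes.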


Reviews: Unsupervised Depth Estimation, 3D Face Rotation and Replacement

Neural Information Processing Systems

In particular, the method estimates the depth of the 2D keypoints of the source image using information from both images, and it estimates the 3D-to-2D affine transform from the source to the target. With this transformation, a traditional keypoint-based face warping algorithm (implemented in OpenGL) and a CycleGAN are used to map the source image to the target image. Note that the estimation of the depth and the affine transform can depend either on the 2D keypoints alone or on both the keypoints and the images.
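The geometric step this review describes reduces to lifting each 2D keypoint with its estimated depth and applying the 3D-to-2D affine map. A minimal sketch, assuming per-keypoint depths and an affine transform already predicted by the model (function and variable names are illustrative, not from the paper):

```python
import numpy as np

def project_keypoints(kp2d_src, depth_src, A, t):
    """Map source 2D keypoints to the target view.

    kp2d_src : (N, 2) source keypoints (x, y)
    depth_src: (N,)   estimated per-keypoint depth
    A        : (2, 3) linear part of the 3D-to-2D affine transform
    t        : (2,)   translation part
    """
    # Lift each 2D keypoint to 3D using its estimated depth.
    kp3d = np.concatenate([kp2d_src, depth_src[:, None]], axis=1)  # (N, 3)
    # Apply the affine map: one matrix product plus a translation.
    return kp3d @ A.T + t  # (N, 2) keypoints in the target image plane

# Toy usage with random values standing in for network estimates.
rng = np.random.default_rng(0)
kp2d = rng.uniform(0, 256, size=(68, 2))   # 68 facial keypoints
depth = rng.uniform(0.5, 2.0, size=68)     # depths predicted by the model
A, t = rng.normal(size=(2, 3)), rng.normal(size=2)
kp_target = project_keypoints(kp2d, depth, A, t)
```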


End-to-end Face-swapping via Adaptive Latent Representation Learning

Lin, Chenhao, Hu, Pengbin, Shen, Chao, Li, Qian

arXiv.org Artificial Intelligence

Taking full advantage of the excellent performance of StyleGAN, style transfer-based face swapping methods have been extensively investigated recently. However, these studies require separate face segmentation and blending modules for successful face swapping, and the fixed selection of the manipulated latent codes in these works is arbitrary, degrading face swapping quality, generalizability, and practicability. This paper proposes a novel, end-to-end integrated framework for high-resolution, attribute-preserving face swapping via Adaptive Latent Representation Learning. Specifically, we first design a multi-task dual-space face encoder that shares the underlying feature extraction network to simultaneously perform facial region perception and face encoding. This encoder enables us to control the face pose and attributes individually, thus enhancing face swapping quality. Next, we propose an adaptive latent code swapping module that learns the mapping between facial attributes and latent codes and selects effective latent codes for improved retention of facial attributes. Finally, the initial face-swapped image generated by StyleGAN2 is blended using the facial region mask produced by our encoder to address the background blur problem. Our framework, which integrates facial perception and blending into the end-to-end training and testing process, achieves highly realistic face swapping on wild faces without segmentation masks. Experimental results demonstrate the superior performance of our approach over state-of-the-art methods.
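The two core operations this abstract describes, selective swapping of StyleGAN2 W+ latent codes and mask-based blending, can be sketched compactly. The following is a hedged illustration: the function names, the per-layer soft weighting scheme, and the 18x512 W+ shape (standard for 1024px StyleGAN2) are assumptions, not the authors' exact design.

```python
import torch

def swap_latents(w_source, w_target, layer_weights):
    """Per-layer soft selection of W+ codes.

    w_source, w_target: (B, L, 512) W+ codes (L = 18 for 1024px StyleGAN2)
    layer_weights     : (L,) learned weights in [0, 1]; 1 = take the layer
                        from the source face, 0 = keep the target's layer
    """
    w = layer_weights.view(1, -1, 1)  # broadcast over batch and channels
    return w * w_source + (1.0 - w) * w_target

def blend_with_mask(generated, target, face_mask):
    """Composite the swapped face back into the target frame.

    generated, target: (B, 3, H, W) images
    face_mask        : (B, 1, H, W) soft facial-region mask in [0, 1]
    """
    return face_mask * generated + (1.0 - face_mask) * target
```

Making the layer selection learned rather than fixed is what distinguishes this approach from the "fixed selection of the manipulated latent codes" the authors criticize.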