
Collaborating Authors

stylegan



StyleGAN knows Normal, Depth, Albedo, and More

Neural Information Processing Systems

Intrinsic images, in the original sense, are image-like maps of scene properties like depth, normal, albedo, or shading. This paper demonstrates that StyleGAN can easily be induced to produce intrinsic images.
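As a rough, hypothetical illustration of how a pretrained generator might be induced to emit such maps: one can search StyleGAN's latent space for a single fixed offset d so that G(w + d) reproduces intrinsic targets (e.g., normals from an off-the-shelf predictor). The sketch below assumes G, the latent codes, and the paired targets are already available; it is an illustrative sketch, not the authors' code.

```python
import torch

def find_intrinsic_offset(G, latents, targets, steps=500, lr=0.01):
    """Search for one latent offset d such that G(w + d) matches paired
    intrinsic targets. G, latents (w codes), and targets are assumptions
    about the setup, not the paper's released interface."""
    d = torch.zeros_like(latents[0], requires_grad=True)
    opt = torch.optim.Adam([d], lr=lr)
    for _ in range(steps):
        idx = torch.randint(len(latents), (1,)).item()
        pred = G(latents[idx] + d)  # same generator, shifted latent code
        loss = torch.nn.functional.mse_loss(pred, targets[idx])
        opt.zero_grad()
        loss.backward()
        opt.step()
    return d.detach()  # one fixed offset, reused for every new image
```

The appeal of such a scheme is that the offset is found once and then reused: no per-image optimization is needed at test time.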


HairFastGAN: Realistic and Robust Hair Transfer with a Fast Encoder-Based Approach

Neural Information Processing Systems

Our paper addresses the complex task of transferring a hairstyle from a reference image to an input photo for virtual hair try-on. This task is challenging due to the need to adapt to various photo poses, the sensitivity of hairstyles, and the lack of objective metrics. Current state-of-the-art hairstyle transfer methods rely on an optimization process for different parts of the approach, making them inexcusably slow. At the same time, faster encoder-based models are of very low quality because they either operate in StyleGAN's W+ space or use other low-dimensional image generators. Additionally, both approaches struggle with hairstyle transfer when the source pose differs substantially from the target pose, because they either do not consider the pose at all or handle it inefficiently. In our paper, we present the HairFast model, which uniquely solves these problems and achieves high resolution, near real-time performance, and reconstruction quality superior to optimization-based methods. Our solution includes a new architecture operating in the FS latent space of StyleGAN, an enhanced inpainting approach, improved encoders for alignment and color transfer, and a new encoder for post-processing. The effectiveness of our approach is demonstrated on realism metrics after random hairstyle transfer and on reconstruction when the original hairstyle is transferred. In the most difficult scenario, transferring both the shape and the color of a hairstyle from different images, our method runs in less than a second on an Nvidia V100.
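To make the optimization-free, staged design concrete, here is a minimal hypothetical sketch of an encoder-based pipeline in the spirit of the description above. Every module name and signature (embed, align_enc, blend_enc, post_enc, G) is an assumption for illustration, not HairFastGAN's actual API.

```python
def hair_transfer(face_img, shape_ref, color_ref,
                  embed, align_enc, blend_enc, post_enc, G):
    """Hypothetical sketch of an encoder-based hair transfer pipeline:
    every stage is a single forward pass, so there is no per-image
    optimization loop. All modules are injected as assumptions."""
    F_face, S_face = embed(face_img)         # FS-space code of the input photo
    F_shape, _ = embed(shape_ref)            # donor of the hairstyle shape
    _, S_color = embed(color_ref)            # donor of the hair color
    F_aligned = align_enc(F_face, F_shape)   # adapt hair shape to the input pose
    S_blended = blend_enc(S_face, S_color)   # transfer color in style space
    # post-processing encoder restores identity details lost along the way
    F_final, S_final = post_enc(F_aligned, S_blended, face_img)
    return G(F_final, S_final)               # decode the edited face
```

Because each stage is a feed-forward encoder rather than a latent optimization, the whole pipeline can plausibly run in well under a second.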


HairDiffusion: Vivid Multi-Colored Hair Editing via Latent Diffusion

Neural Information Processing Systems

Hair editing is a critical image synthesis task that aims to edit hair color and hairstyle using text descriptions or reference images, while preserving irrelevant attributes (e.g., identity, background, clothing). Many existing methods are based on StyleGAN to address this task. However, due to the limited spatial distribution of StyleGAN, it struggles with multiple hair color editing and facial preservation. Considering the advancements in diffusion models, we utilize Latent Diffusion Models (LDMs) for hairstyle editing. Our approach introduces Multi-stage Hairstyle Blend (MHB), effectively separating control of hair color and hairstyle in the diffusion latent space. Additionally, we train a warping module to align the hair color with the target region. To further enhance multi-color hairstyle editing, we fine-tune a CLIP model using a multi-color hairstyle dataset. Our method not only tackles the complexity of multi-color hairstyles but also addresses the challenge of preserving original colors during diffusion editing. Extensive experiments showcase the superiority of our method in editing multi-color hairstyles while preserving facial attributes, given textual descriptions and reference images.
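A minimal sketch of how blending in the diffusion latent space could separate hair from everything else, assuming two parallel denoising trajectories (an edited one and a source reconstruction) and a binary hair mask. The single timestep threshold and all argument names are invented for illustration and simplify the multi-stage scheme described above.

```python
def blend_step(x_t_edit, x_t_src, hair_mask, t, t_blend):
    """Hypothetical per-step latent blend: early in the reverse process the
    edit trajectory runs freely so the new hair structure can form; after
    t_blend, the source latent is copied back outside the hair mask so
    identity and background are preserved. t_blend is an assumption."""
    if t > t_blend:
        return x_t_edit  # early stage: let the edited hair take shape
    # late stage: keep the edit only inside the hair region
    return hair_mask * x_t_edit + (1 - hair_mask) * x_t_src
```

Such a function would be called once per reverse diffusion step, with separate thresholds for color and shape giving the "multi-stage" separation of control.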



A Experiment Details on Representation

Neural Information Processing Systems

We follow the same optimization setup as SimSiam [11]. The encoder backbone is a ResNet-18 [22] with the last fully connected layer removed, and we use the same projection and prediction networks as SimSiam. Training uses the symmetrized cosine similarity loss from SimSiam. After pretraining, the representation encoder f is fine-tuned for downstream tasks.
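For reference, the symmetrized cosine similarity loss from SimSiam [11] can be written as follows. This is a minimal PyTorch rendering with our own variable names, where p1, p2 are predictor outputs and z1, z2 are projector outputs for the two augmented views.

```python
import torch.nn.functional as F

def simsiam_loss(p1, p2, z1, z2):
    """Symmetrized negative cosine similarity from SimSiam [11].
    The stop-gradient (detach) on the projector outputs z is the key
    ingredient that prevents representation collapse."""
    def D(p, z):
        return -F.cosine_similarity(p, z.detach(), dim=-1).mean()
    return 0.5 * D(p1, z2) + 0.5 * D(p2, z1)
```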



A More Analyses: A.1 Evaluation of Whitebox and Blackbox Attacks at FMR = 10

Neural Information Processing Systems

Table 7 and Table 8 of this appendix report the evaluation of attacks with whitebox and blackbox knowledge, respectively, of the system from which the template is leaked.

It is noteworthy that, in training GANs (even conditional GANs), noise (e.g., sampled from a Gaussian distribution) is generally used in the input. The noise samples in the input help the generator learn the distribution of the output space, and therefore help the generator network generate outputs from the same distribution as the real data. However, our method can also be used with other face generator networks. Let us consider the complete pipeline of our problem formulation as depicted in Figure 2 of the paper. During inference (i.e., attacking the target FR system), however, the generated high-resolution face ...

Mitigation of such Attacks

This paper demonstrates an important privacy and security threat to state-of-the-art unprotected face recognition systems. Regulations, such as [European Council, 2016], put legal obligations to protect biometric data as sensitive information. We build face recognition pipelines using the Bob [Anjos et al., 2012, 2017] toolbox. We have also cited the corresponding paper for each dataset.
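The role of the input noise described above can be illustrated with a toy conditional generator: the leaked template conditions the output, while fresh Gaussian noise lets the network cover the distribution of plausible faces that match that template. The architecture and sizes below are illustrative assumptions, not the paper's model.

```python
import torch
import torch.nn as nn

class ConditionalFaceGenerator(nn.Module):
    """Toy sketch of a noise-conditioned face generator: it maps a leaked
    face embedding plus Gaussian noise to an image, so different noise
    samples yield different faces from the same template. All layer sizes
    are illustrative assumptions."""
    def __init__(self, emb_dim=512, noise_dim=128):
        super().__init__()
        self.noise_dim = noise_dim
        self.net = nn.Sequential(
            nn.Linear(emb_dim + noise_dim, 1024),
            nn.ReLU(),
            nn.Linear(1024, 3 * 32 * 32),
            nn.Tanh(),  # toy 32x32 RGB output for illustration
        )

    def forward(self, embedding, noise=None):
        if noise is None:  # fresh Gaussian noise per call, as described above
            noise = torch.randn(embedding.size(0), self.noise_dim,
                                device=embedding.device)
        x = torch.cat([embedding, noise], dim=-1)
        return self.net(x).view(-1, 3, 32, 32)
```

In a real attack pipeline the decoder would be a high-resolution face generator; the point of the sketch is only the conditioning-plus-noise input structure.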