facial image
ACFun: Abstract-Concrete Fusion Facial Stylization
Owing to advancements in image synthesis techniques, stylization methodologies for large models have garnered remarkable outcomes. However, when it comes to processing facial images, the outcomes frequently fall short of expectations. Facial stylization is predominantly challenged by two significant hurdles. Firstly, obtaining a large dataset of high-quality stylized images is difficult. The scarcity and diversity of artistic styles make it impractical to compile comprehensive datasets for each style.
- Research Report > Experimental Study (0.93)
- Research Report > New Finding (0.67)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
- Europe > Finland > Northern Ostrobothnia > Oulu (0.05)
- Europe > United Kingdom > Wales > Ceredigion > Aberystwyth (0.04)
- Asia > Japan > Honshū > Chūbu > Nagano Prefecture > Nagano (0.04)
- Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
Can Current Detectors Catch Face-to-Voice Deepfake Attacks?
Nguyen, Nguyen Linh Bao, Abuadbba, Alsharif, Moore, Kristen, Wu, Tingmin
The rapid advancement of generative models has enabled the creation of increasingly stealthy synthetic voices, commonly referred to as audio deepfakes. A recent technique, FOICE [USENIX'24], demonstrates a particularly alarming capability: generating a victim's voice from a single facial image, without requiring any voice sample. By exploiting correlations between facial and vocal features, FOICE produces synthetic voices realistic enough to bypass industry-standard authentication systems, including WeChat Voiceprint and Microsoft Azure. This raises serious security concerns, as facial images are far easier for adversaries to obtain than voice samples, dramatically lowering the barrier to large-scale attacks. In this work, we investigate two core research questions: (RQ1) can state-of-the-art audio deepfake detectors reliably detect FOICE-generated speech under clean and noisy conditions, and (RQ2) does fine-tuning these detectors on FOICE data improve detection without overfitting, thereby preserving robustness to unseen voice generators such as SpeechT5? Our study makes three contributions. First, we present the first systematic evaluation of FOICE detection, showing that leading detectors consistently fail under both standard and noisy conditions. Second, we introduce targeted fine-tuning strategies that capture FOICE-specific artifacts, yielding significant accuracy improvements. Third, we assess generalization after fine-tuning, revealing trade-offs between specialization to FOICE and robustness to unseen synthesis pipelines. These findings expose fundamental weaknesses in today's defenses and motivate new architectures and training protocols for next-generation audio deepfake detection.
- Oceania > Australia > Victoria > Melbourne (0.05)
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States (0.04)
FaSDiff: Balancing Perception and Semantics in Face Compression via Stable Diffusion Priors
Zhou, Yimin, Xia, Yichong, Chen, Bin, Hong, Mingyao, Li, Jiawei, Wang, Zhi, Wang, Yaowei
With the increasing deployment of facial image data across a wide range of applications, efficient compression tailored to facial semantics has become critical for both storage and transmission. While recent learning-based face image compression methods have achieved promising results, they often suffer from degraded reconstruction quality at low bit rates. Directly applying diffusion-based generative priors to this task leads to suboptimal performance in downstream machine vision tasks, primarily due to poor preservation of high-frequency details. In this work, we propose FaSDiff (Facial Image Compression with a Stable Diffusion Prior), a novel diffusion-driven compression framework designed to enhance both visual fidelity and semantic consistency. FaSDiff incorporates a high-frequency-sensitive compressor to capture fine-grained details and generate robust visual prompts for guiding the diffusion model. To address low-frequency degradation, we further introduce a hybrid low-frequency enhancement module that disentangles and preserves semantic structures, enabling stable modulation of the diffusion prior during reconstruction. By jointly optimizing perceptual quality and semantic preservation, FaSDiff effectively balances human visual fidelity and machine vision accuracy. Extensive experiments demonstrate that FaSDiff outperforms state-of-the-art methods in both perceptual metrics and downstream task performance.
- Asia > China > Guangdong Province > Shenzhen (0.05)
- Asia > China > Heilongjiang Province > Harbin (0.04)
AI study gives insights into why super-recognisers excel at identifying faces
Research has suggested super-recognisers look at more areas across a face than typical people. Research uses eye-tracking data to examine some people's extraordinary recognition ability. They have been used in the search for the Salisbury novichok poisoners, finding murder suspects and even spotting sexual predators. Now, research has revealed fresh insights into why super-recognisers are so good at identifying faces. Previous research has suggested people with an extraordinary ability to recognise people look at more areas across a face than typical people.
- North America > United States (0.19)
- Europe > Ukraine (0.07)
- Oceania > Australia (0.05)
- Europe > United Kingdom > England > Dorset > Bournemouth (0.05)
- Government > Regional Government (0.96)
- Leisure & Entertainment > Sports (0.75)
- Health & Medicine (0.71)
Accurate and Private Diagnosis of Rare Genetic Syndromes from Facial Images with Federated Deep Learning
Ünal, Ali Burak, Baykara, Cem Ata, Krawitz, Peter, Akgün, Mete
Machine learning has shown promise in facial dysmorphology, where characteristic facial features provide diagnostic clues for rare genetic disorders. GestaltMatcher, a leading framework in this field, has demonstrated clinical utility across multiple studies, but its reliance on centralized datasets limits further development, as patient data are siloed across institutions and subject to strict privacy regulations. We introduce a federated GestaltMatcher service based on a cross-silo horizontal federated learning framework, which allows hospitals to collaboratively train a global ensemble feature extractor without sharing patient images. Patient data are mapped into a shared latent space, and a privacy-preserving kernel matrix computation framework enables syndrome inference and discovery while safeguarding confidentiality. New participants can directly benefit from and contribute to the system by adopting the global feature extractor and kernel configuration from previous training rounds. Experiments show that the federated service retains over 90% of centralized performance and remains robust to both varying silo numbers and heterogeneous data distributions.
- Information Technology > Security & Privacy (1.00)
- Health & Medicine > Therapeutic Area > Genetic Disease (0.70)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Vision > Face Recognition (0.86)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.82)