SFANet: Spatial-Frequency Attention Network for Deepfake Detection
Ahire, Vrushank, Muley, Aniruddh, Zample, Shivam, Verma, Siddharth, Menon, Pranav, Madan, Surbhi, Dhall, Abhinav
arXiv.org Artificial Intelligence
Abstract--Detecting manipulated media has become a pressing issue with the recent rise of deepfakes. Most existing approaches fail to generalize across diverse datasets and generation techniques. We therefore propose a novel ensemble framework that combines transformer-based architectures, such as Swin Transformers and ViTs, with texture-based methods to improve detection accuracy and robustness. Our method introduces data-splitting, sequential training, frequency splitting, patch-based attention, and face segmentation techniques to handle dataset imbalances, emphasize high-impact regions (e.g., eyes and mouth), and improve generalization. Our model achieves state-of-the-art performance on the DFWild-Cup dataset, a diverse subset of eight deepfake datasets. This work demonstrates that hybrid models can effectively address the evolving challenges of deepfake detection, offering a robust solution for real-world applications.
The rapid advancement of deep learning and generative models has led to the proliferation of deepfakes. AI-generated images, videos, and audio recordings are becoming increasingly realistic, making it difficult for humans and traditional systems to distinguish real content from manipulated content.
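The abstract names frequency splitting as one component but does not describe how it is done. As a rough illustration only, the sketch below separates a grayscale image into low- and high-frequency parts with a circular mask in the Fourier domain. The function name `split_frequencies`, the `radius` parameter, and the choice of an ideal circular low-pass mask are all assumptions made for this example; they are not taken from the paper.

```python
import numpy as np


def split_frequencies(image, radius):
    """Split a 2-D grayscale image into low- and high-frequency parts.

    Hypothetical helper for illustration; not the paper's implementation.
    `radius` is the cutoff (in frequency bins) of an ideal low-pass mask.
    """
    h, w = image.shape
    # Move the zero-frequency (DC) term to the centre of the spectrum.
    spectrum = np.fft.fftshift(np.fft.fft2(image))

    # Distance of every frequency bin from the centre.
    yy, xx = np.ogrid[:h, :w]
    dist = np.sqrt((yy - h // 2) ** 2 + (xx - w // 2) ** 2)
    low_mask = dist <= radius

    # Keep only the low frequencies and transform back to pixel space.
    low = np.fft.ifft2(np.fft.ifftshift(spectrum * low_mask)).real
    # Define the high-frequency part as the residual, so low + high == image.
    high = image - low
    return low, high


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((64, 64))
    low, high = split_frequencies(img, radius=8)
    print(low.shape, high.shape)
```

In a detector of this kind, the two bands could plausibly be fed to separate branches, since manipulation artifacts often show up in fine texture; whether SFANet does this is not stated in the text above.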
Oct-7-2025
- Country:
  - Africa > Central African Republic
    - Ombella-M'Poko > Bimbo (0.04)
  - Asia > India
    - Punjab (0.04)
  - North America > United States (0.04)
  - Oceania > Australia
- Genre:
  - Overview (0.68)
  - Research Report (1.00)
- Industry:
  - Information Technology > Security & Privacy (1.00)
- Technology: