Foundation Models and Transformers for Anomaly Detection: A Survey
Ammar, Mouïn Ben, Mendoza, Arturo, Belkhir, Nacim, Manzanera, Antoine, Franchi, Gianni
–arXiv.org Artificial Intelligence
In line with the development of deep learning, this survey examines the transformative role of Transformers and foundation models in advancing visual anomaly detection (VAD). We explore how these architectures, with their global receptive fields and adaptability, address challenges such as long-range dependency modeling, contextual modeling and data scarcity. The survey categorizes VAD methods into reconstruction-based, feature-based and zero/few-shot approaches, highlighting the paradigm shift brought about by foundation models. By integrating attention mechanisms and leveraging large-scale pre-training, Transformers and foundation models enable more robust, interpretable, and scalable anomaly detection solutions. This work provides a comprehensive review of state-of-the-art techniques, their strengths, limitations, and emerging trends in leveraging these architectures for VAD.
Jul-23-2025
- Country:
- Europe > Switzerland (0.28)
- Genre:
- Overview (1.00)
- Research Report > Promising Solution (0.65)
- Industry:
- Health & Medicine
- Diagnostic Medicine (0.46)
- Therapeutic Area (0.45)
- Technology: