Foundation Models and Transformers for Anomaly Detection: A Survey
Ammar, Mouïn Ben; Mendoza, Arturo; Belkhir, Nacim; Manzanera, Antoine; Franchi, Gianni
arXiv.org Artificial Intelligence
In line with the development of deep learning, this survey examines the transformative role of Transformers and foundation models in advancing visual anomaly detection (VAD). We explore how these architectures, with their global receptive fields and adaptability, address challenges such as long-range dependency modeling, contextual modeling, and data scarcity. The survey categorizes VAD methods into reconstruction-based, feature-based, and zero/few-shot approaches, highlighting the paradigm shift brought about by foundation models. By integrating attention mechanisms and leveraging large-scale pre-training, Transformers and foundation models enable more robust, interpretable, and scalable anomaly detection solutions. This work provides a comprehensive review of state-of-the-art techniques, their strengths, limitations, and emerging trends in leveraging these architectures for VAD.
Jul-23-2025
- Country:
  - Asia > Japan
    - Honshū > Kantō > Kanagawa Prefecture (0.04)
  - Europe
    - France (0.04)
    - Switzerland > Zürich
      - Zürich (0.14)
    - United Kingdom > England
      - Greater London > London (0.04)
  - North America > Canada
    - Alberta > Census Division No. 19 > Saddle Hills County (0.04)
- Genre:
  - Overview (1.00)
  - Research Report > Promising Solution (0.65)
- Industry:
  - Health & Medicine
    - Diagnostic Medicine (0.46)
    - Therapeutic Area (0.45)
- Technology: