Towards Visual Discrimination and Reasoning of Real-World Physical Dynamics: Physics-Grounded Anomaly Detection

Li, Wenqiao, Gu, Yao, Chen, Xintao, Xu, Xiaohao, Hu, Ming, Huang, Xiaonan, Wu, Yingna

arXiv.org Artificial Intelligence 

Humans detect real-world object anomalies by perceiving, interacting, and reasoning based on object-conditioned physical knowledge. The long-term goal of Industrial Anomaly Detection (IAD) is to enable machines to autonomously replicate this skill. However, current IAD algorithms are largely developed and tested on static, semantically simple datasets, which diverge from real-world scenarios where physical understanding and reasoning are essential. To bridge this gap, we introduce the Physics Anomaly Detection (Phys-AD) dataset, the first large-scale, real-world, physics-grounded video dataset for industrial anomaly detection. The dataset includes more than 6400 videos across 22 real-world object categories, interacting with robot arms and motors, and exhibits 47 types of anomalies. Anomaly detection in Phys-AD requires visual reasoning, combining both physical knowledge and video content to determine object abnormality. We benchmark state-of-the-art anomaly detection methods under three settings: unsupervised AD, weakly-supervised AD, and video-understanding AD, highlighting their limitations in handling physics-grounded anomalies. Additionally, we introduce the Physics Anomaly Explanation (PAEval) metric, designed to assess the ability of visual-language foundation models to not only detect anomalies but also provide accurate explanations.