Short-Window Sliding Learning for Real-Time Violence Detection via LLM-based Auto-Labeling

Jung, Seoik, Song, Taekyung, Lee, Yangro, Lee, Sungjun

Nov-17-2025, arXiv.org Artificial Intelligence

Abstract: This paper proposes a Short-Window Sliding Learning framework for real-time violence detection in CCTV footage. Unlike conventional approaches that train on long videos, the proposed method divides videos into 1-2 second clips and applies Large Language Model (LLM)-based auto-caption labeling to construct fine-grained datasets. Each short clip uses all of its frames to preserve temporal continuity, enabling precise recognition of rapid violent events. Experiments show that the proposed method achieves 95.25% accuracy on RWF-2000 and significantly improves performance on long videos (UCF-Crime: 83.25%), supporting its generalization and real-time applicability in intelligent surveillance systems.

Recently, video-based detection of violence and abnormal behavior has gained attention as a core technology in fields such as public safety, smart cities, and intelligent surveillance [1].
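The short-window splitting described in the abstract can be illustrated with a minimal sketch. It computes frame-index ranges for fixed-length clips that keep every frame inside each window. The function name `short_windows`, the 0.5-second stride, and the choice to drop a trailing partial window are assumptions made for illustration; the abstract states only that clips are 1-2 seconds long and use all frames.

```python
from typing import List, Tuple


def short_windows(
    num_frames: int,
    fps: float,
    window_sec: float = 1.0,   # clip length; the abstract gives 1-2 seconds
    stride_sec: float = 0.5,   # assumed stride, not specified in the source
) -> List[Tuple[int, int]]:
    """Return half-open (start, end) frame ranges for sliding short clips.

    Every frame in [start, end) belongs to the clip, so no frames are
    subsampled inside a window. A trailing partial window is dropped
    (an assumption of this sketch).
    """
    if num_frames < 0 or fps <= 0 or window_sec <= 0 or stride_sec <= 0:
        raise ValueError("num_frames must be >= 0; fps and durations must be > 0")

    window = max(1, round(window_sec * fps))   # frames per clip
    stride = max(1, round(stride_sec * fps))   # frames between clip starts

    clips = []
    start = 0
    while start + window <= num_frames:
        clips.append((start, start + window))
        start += stride
    return clips


if __name__ == "__main__":
    # A 3-second video at 30 fps, split into 1-second clips.
    for start, end in short_windows(num_frames=90, fps=30):
        print(f"frames {start}..{end - 1}")
```

Each returned range would then be passed to a captioning or labeling step; that step is not sketched here because the abstract does not describe its interface.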



