CUE-Net: Violence Detection Video Analytics with Spatial Cropping, Enhanced UniformerV2 and Modified Efficient Additive Attention
Senadeera, Damith Chamalke, Yang, Xiaoyun, Kollias, Dimitrios, Slabaugh, Gregory
–arXiv.org Artificial Intelligence
In this paper we introduce CUE-Net, a novel architecture designed for automated violence detection in video surveillance. As surveillance systems become more prevalent due to technological advances and decreasing costs, the challenge of efficiently monitoring vast amounts of video data has intensified. CUE-Net addresses this challenge by combining spatial Cropping with an enhanced version of the UniformerV2 architecture, integrating convolutional and self-attention mechanisms alongside a novel Modified Efficient Additive Attention mechanism (which reduces the quadratic time complexity of self-attention) to effectively and efficiently identify violent activities.

To respond to the challenge of efficient, automated violence detection from video, effective computer vision methods are required. Deep learning techniques such as Convolutional Neural Networks (CNNs) and, more recently, Transformer-based architectures have shown great promise in solving computer-vision-related automated violence detection [21, 22, 31]. The success of violence detection is highly dependent on the objects and people present in the captured videos [22, 31]. Detection is difficult when the relevant features of the violent incidents are not captured properly, for example when the people involved in the violent incident are far away and occupy only a small part of the frame, as seen in one of the example videos from
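The text above does not spell out how the spatial cropping step is carried out. As a rough illustration only, the sketch below assumes that person bounding boxes are already available from some detector and crops a frame to the region that encloses them, so that distant people fill more of the input. The function name `crop_to_people`, the `margin` parameter, and the `(x1, y1, x2, y2)` pixel box format are all hypothetical choices made for this example, not details taken from the paper.

```python
import numpy as np


def crop_to_people(frame, boxes, margin=0.1):
    """Crop a frame of shape (H, W, C) to the union of person boxes.

    boxes: iterable of (x1, y1, x2, y2) pixel coordinates (assumed format).
    margin: fraction of the union box size added as padding on each side
            (hypothetical parameter, not from the paper).
    """
    height, width = frame.shape[:2]
    if len(boxes) == 0:
        # No people detected: fall back to the full frame.
        return frame

    b = np.asarray(boxes, dtype=float)
    # Smallest rectangle that contains every detected person.
    x1, y1 = b[:, 0].min(), b[:, 1].min()
    x2, y2 = b[:, 2].max(), b[:, 3].max()

    # Pad the rectangle, then clamp it to the frame boundaries.
    pad_x, pad_y = margin * (x2 - x1), margin * (y2 - y1)
    left = max(int(np.floor(x1 - pad_x)), 0)
    top = max(int(np.floor(y1 - pad_y)), 0)
    right = min(int(np.ceil(x2 + pad_x)), width)
    bottom = min(int(np.ceil(y2 + pad_y)), height)

    return frame[top:bottom, left:right]
```

In a video pipeline, the cropped frames would typically be resized to a fixed resolution before being passed to the classifier; that step is omitted here.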
Apr-27-2024