How Weakly Supervised Learning works(Machine Learning)

Jul-25-2022, 09:36:45 GMT–#artificialintelligence

Abstract: Weakly-supervised audio-visual violence detection aims to distinguish snippets containing multimodal violence events with video-level labels. Many prior works perform audio-visual integration and interaction in an early or intermediate manner, yet overlooking the modality heterogeneousness over the weakly-supervised setting. In this paper, we analyze the modality asynchrony and undifferentiated instances phenomena of the multiple instance learning (MIL) procedure, and further investigate its negative impact on weakly-supervised audio-visual learning. To address these issues, we propose a modality-aware contrastive instance learning with self-distillation (MACIL-SD) strategy. Specifically, we leverage a lightweight two-stream network to generate audio and visual bags, in which unimodal background, violent, and normal instances are clustered into semi-bags in an unsupervised way. Then audio and visual violent semi-bag representations are assembled as positive pairs, and violent semi-bags are combined with background and normal instances in the opposite modality as contrastive negative pairs.

instance-level prediction, robust bag likelihood, weakly supervised learning work, (8 more...)

#artificialintelligence

Jul-25-2022, 09:36:45 GMT

News Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.58)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found