Learning Sparse Temporal Video Mapping for Action Quality Assessment in Floor Gymnastics
Zahan, Sania, Hassan, Ghulam Mubashar, Mian, Ajmal
arXiv.org Artificial Intelligence
Abstract: Athlete performance measurement in sports videos requires modeling long sequences, since the entire spatio-temporal progression contributes dominantly to the performance. Accurate evaluation depends on comprehending both local discriminative spatial dependencies and global semantics. However, existing benchmark datasets mainly cover sports in which the performance lasts only a few seconds. Consequently, state-of-the-art sports quality assessment methods focus specifically on spatial structure. Although they achieve high performance in short-term sports, they are unable to model prolonged video sequences and fail to achieve similar performance in long-term sports. To facilitate such analysis, we introduce a new dataset, coined AGF-Olympics, that incorporates artistic gymnastic floor routines. AGF-Olympics provides highly challenging scenarios with extensive background, viewpoint, and scale variations over an extended sample duration of up to 2 minutes. In addition, we propose a discriminative attention module that maps the dense feature space into a sparse representation by disentangling complex associations. Extensive experiments indicate that our proposed module provides an effective way to embed long-range spatial and temporal correlation semantics.
Figure: AQA conceptual workflow: discriminative non-local attention focuses on latent spatio-temporal association.
Jan-15-2023
- Country:
  - Asia > Pakistan (0.04)
  - Oceania > Australia
    - Western Australia (0.05)
  - North America > United States
    - Oklahoma (0.04)
- Genre:
  - Research Report (0.50)
  - Personal (0.46)
- Industry:
  - Health & Medicine (1.00)
  - Government > Regional Government (0.68)
  - Leisure & Entertainment > Sports
    - Olympic Games (0.68)
- Technology: