Human Activity Encoding and Recognition Using Low-level Visual Features
Wang, Zheshen (Arizona State University) | Li, Baoxin (Arizona State University)
Automatic recognition of human activities is among the key capabilities of many intelligent systems with vision or perception components. Most existing approaches to this problem require sophisticated feature extraction before classification can be performed. This paper presents a novel approach to human action recognition that uses only simple low-level visual features: motion captured by direct frame differencing. A codebook of key poses is first created from the training data through unsupervised clustering. Videos of actions are then coded as sequences of super-frames, defined as the key poses augmented with discriminative attributes. A weighted-sequence distance is proposed for comparing two super-frame sequences, and this distance is wrapped as a kernel embedded in an SVM classifier for the final classification. Compared with conventional methods, our approach provides a flexible non-parametric sequential structure, with a corresponding distance measure, for human action representation and classification without requiring complex feature extraction. The effectiveness of our approach is demonstrated on the widely used KTH human activity dataset, on which the proposed method outperforms the existing state of the art.
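The first two stages described in the abstract, motion from frame differencing and a key-pose codebook built by unsupervised clustering, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the binary thresholding of the frame difference, the use of k-means as the clustering method, and the names `frame_difference`, `kmeans`, `nearest`, and `encode_video`. The discriminative attributes of super-frames and the weighted-sequence distance are not specified in the abstract, so they are left out.

```python
import random


def frame_difference(prev, curr, threshold=10):
    """Flattened binary motion mask: 1 where |curr - prev| exceeds threshold.

    Frames are lists of rows of grayscale values. The threshold is an
    illustrative assumption, not a value from the paper.
    """
    return [
        1 if abs(c - p) > threshold else 0
        for prev_row, curr_row in zip(prev, curr)
        for p, c in zip(prev_row, curr_row)
    ]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vector, centers):
    """Index of the codebook entry closest to the given vector."""
    return min(range(len(centers)),
               key=lambda k: squared_distance(vector, centers[k]))


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain k-means, used here as a stand-in for the unsupervised clustering."""
    rng = random.Random(seed)
    centers = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[nearest(v, centers)].append(v)
        for idx, group in enumerate(groups):
            if group:  # an empty cluster keeps its previous center
                centers[idx] = [sum(col) / len(group) for col in zip(*group)]
    return centers


def encode_video(frames, centers, threshold=10):
    """Code a video as the sequence of nearest key-pose indices."""
    masks = [frame_difference(a, b, threshold)
             for a, b in zip(frames, frames[1:])]
    return [nearest(m, centers) for m in masks]
```

In a fuller pipeline, each index in the encoded sequence would be augmented with attributes to form a super-frame, and pairs of sequences would be compared with a sequence distance turned into an SVM kernel; those parts depend on definitions given in the full paper.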
- Country:
- North America > United States
- Arizona (0.04)
- Europe > United Kingdom
- England
- Oxfordshire > Oxford (0.04)
- Cambridgeshire > Cambridge (0.04)
- Genre:
- Research Report > Promising Solution (0.34)
- Overview > Innovation (0.34)
- Technology: