action recognition
Deep Alternative Neural Network: Exploring Contexts as Early as Possible for Action Recognition
Contexts are crucial for action recognition in video. Current methods often mine contexts after extracting hierarchical local features and focus on their high-order encodings. This paper instead explores contexts as early as possible and leverages their evolutions for action recognition. In particular, we introduce a novel architecture called deep alternative neural network (DANN) stacking alternative layers. Each alternative layer consists of a volumetric convolutional layer followed by a recurrent layer.
Unsupervised Learning of View-invariant Action Representations
The recent success in human action recognition with deep learning methods mostly adopt the supervised learning paradigm, which requires significant amount of manually labeled data to achieve good performance. However, label collection is an expensive and time-consuming process. In this work, we propose an unsupervised learning framework, which exploits unlabeled data to learn video representations. Different from previous works in video representation learning, our unsupervised learning task is to predict 3D motion in multiple target views using video representation from a source view. By learning to extrapolate cross-view motions, the representation can capture view-invariant motion dynamics which is discriminative for the action. In addition, we propose a view-adversarial training method to enhance learning of view-invariant features. We demonstrate the effectiveness of the learned representations for action recognition on multiple datasets.
- North America > United States > Illinois (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
MM-Fi: Multi-Modal Non-Intrusive 4D Human Dataset for Versatile Wireless Sensing Jianfei Y ang 1, He Huang 1, Y unjiao Zhou
MA TLAB, as shown in Table 2. To enhance the sensing quality, we have aggregated five adjacent frames into a new frame for use. WiFi CSI data, there are some "-inf" values in some sequences. The "-inf" number comes from the To facilitate the users, we have embedded these processing codes into our dataset tool. When the user loads our WiFi CSI data, these numbers will be handled by linear interpolation. As presented in Section 4.3, we provide the temporal Each sequence is annotated by at least 5 human annotators.
- Information Technology (1.00)
- Government (1.00)
- Law (0.92)
- Leisure & Entertainment > Sports > Soccer (0.68)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
- Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Government (1.00)
- Information Technology (0.93)
- Law (0.67)