Action Recognition With Coarse-to-Fine Deep Feature Integration and Asynchronous Fusion
Lin, Weiyao (Shanghai Jiao Tong University) | Zhang, Chongyang (Shanghai Jiao Tong University) | Lu, Ke (University of Chinese Academy of Sciences) | Sheng, Bin (Shanghai Jiao Tong University) | Wu, Jianxin (Nanjing University) | Ni, Bingbing (Shanghai Jiao Tong University) | Liu, Xin (Shenzhen Tencent Computer System Co.) | Xiong, Hongkai (Shanghai Jiao Tong University)
Action recognition is an important yet challenging task in computer vision. In this paper, we propose a novel deep learning-based framework for action recognition, which improves recognition accuracy by: 1) deriving more precise features for representing actions, and 2) reducing the asynchrony between different information streams. We first introduce a coarse-to-fine network, which extracts shared deep features at different action class granularities and progressively integrates them to obtain a more accurate feature representation for input actions. We further introduce an asynchronous fusion network, which fuses information from different streams by asynchronously integrating stream-wise features at different time points, thereby better leveraging the complementary information in different streams. Experimental results on action recognition benchmarks demonstrate that our approach achieves state-of-the-art performance.
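The asynchronous fusion idea can be illustrated with a minimal sketch. It is not the network described in the paper: the function name `fuse_async`, the `offsets` parameter, and the use of element-wise averaging followed by concatenation are all assumptions made for illustration, standing in for fusion layers that would be learned. The sketch only shows the core pattern of pairing one stream's feature at time t with the other stream's features at several nearby time points.

```python
from typing import List, Sequence

Vector = List[float]


def fuse_async(
    stream_a: Sequence[Vector],
    stream_b: Sequence[Vector],
    offsets: Sequence[int] = (-1, 0, 1),
) -> List[Vector]:
    """Pair stream_a[t] with stream_b[t + d] for every offset d.

    Hypothetical stand-in for a learned fusion layer: the partner
    features are averaged element-wise and concatenated to stream_a[t].
    Offsets that fall outside the clip are skipped.
    """
    if len(stream_a) != len(stream_b):
        raise ValueError("streams must cover the same number of time steps")

    fused: List[Vector] = []
    for t, feat_a in enumerate(stream_a):
        # Gather the other stream's features at the shifted time points.
        partners = [
            stream_b[t + d] for d in offsets if 0 <= t + d < len(stream_b)
        ]
        if not partners:
            raise ValueError(f"no valid offset for time step {t}")
        # Average the partner features element-wise.
        mean_b = [sum(vals) / len(partners) for vals in zip(*partners)]
        # Concatenate so both streams remain visible downstream.
        fused.append(list(feat_a) + mean_b)
    return fused


# Toy one-dimensional features for two streams over three time steps.
appearance = [[1.0], [2.0], [3.0]]
motion = [[10.0], [20.0], [30.0]]
fused_features = fuse_async(appearance, motion)
```

Passing `offsets=(0,)` reduces the sketch to ordinary synchronous fusion, in which each time step sees only the other stream's feature from the same moment.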
Feb-8-2018