Two-Stream Convolutional Networks for Action Recognition in Videos
Simonyan, Karen, Zisserman, Andrew
Neural Information Processing Systems
We investigate architectures of discriminatively trained deep Convolutional Networks (ConvNets) for action recognition in video. The challenge is to capture the complementary information on appearance from still frames and motion between frames. We also aim to generalise the best performing hand-crafted features within a data-driven learning framework. First, we propose a two-stream ConvNet architecture which incorporates spatial and temporal networks. Second, we demonstrate that a ConvNet trained on multi-frame dense optical flow is able to achieve very good performance in spite of limited training data.
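The two ideas named in the abstract, a temporal network fed with multi-frame dense optical flow and a combination of spatial and temporal networks, can be illustrated with a small NumPy sketch. Everything specific here is an assumption made for illustration and is not stated in the abstract: the function names, the `(L, H, W, 2)` flow layout, the interleaved channel order, and the choice of averaging softmax scores as the fusion rule.

```python
import numpy as np


def stack_optical_flow(flows):
    """Stack L dense flow fields into one multi-channel network input.

    flows: array of shape (L, H, W, 2), where the last axis holds the
    horizontal and vertical displacement (hypothetical layout).
    Returns an array of shape (H, W, 2L) with channels ordered as
    dx_0, dy_0, dx_1, dy_1, ...
    """
    flows = np.asarray(flows, dtype=np.float32)
    if flows.ndim != 4 or flows.shape[-1] != 2:
        raise ValueError("expected flows with shape (L, H, W, 2)")
    num_frames, height, width, _ = flows.shape
    # Move the frame axis next to the (dx, dy) axis, then merge the two.
    return flows.transpose(1, 2, 0, 3).reshape(height, width, 2 * num_frames)


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def fuse_by_averaging(spatial_logits, temporal_logits):
    """Late fusion: average the class posteriors of the two streams.

    Averaging is one simple fusion rule, assumed here for illustration.
    """
    return 0.5 * (softmax(spatial_logits) + softmax(temporal_logits))


# Usage with placeholder data: 10 flow fields of size 224 x 224.
flow_stack = stack_optical_flow(np.zeros((10, 224, 224, 2)))
scores = fuse_by_averaging(np.array([2.0, 0.5, 0.1]), np.array([0.3, 1.7, 0.2]))
```

In this sketch the spatial stream would receive a single RGB frame, while the temporal stream would receive `flow_stack`, so motion over several frames reaches the network as extra input channels rather than through a recurrent or 3D architecture.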
Feb-14-2020, 06:11:44 GMT
- Technology:
- Information Technology > Artificial Intelligence
- Vision (1.00)
- Machine Learning (0.79)