Reviews: Spatiotemporal Residual Networks for Video Action Recognition

Jan-20-2025, 09:59:34 GMT – Neural Information Processing Systems

This paper presents a framework that improves two-stream networks for video action recognition by extending residual networks to combine information from the two streams in a single network. It significantly improves over the previous state of the art on two popular video action recognition benchmarks. The downside of this paper is its limited novelty. Previous work has already tried to combine two streams into a single network [1,2], and temporal convolution is not new either [3]. Although the way the two streams are combined differs slightly from previous work, the proposed approach is still fairly straightforward.

spatiotemporal residual network, temporal convolution, video action recognition

Technology:
- Information Technology > Artificial Intelligence > Vision (1.00)