Reviews: Trajectory Convolution for Action Recognition
Neural Information Processing Systems
UPDATE: Thank you to the authors for addressing my concerns. With the new version of Table 1 and the clarification of ResNet-18 vs. BN-Inception, my concern about the experimentation has been addressed: there does seem to be a clear improvement over classical 3D convolution. I have adjusted my score upwards accordingly.

Recently, a number of new neural network models for action recognition in video have been introduced that employ 3D (spacetime) convolutions to show significant gains on large benchmark datasets. When there is significant human or camera motion, convolutions through time at a fixed (x,y) image coordinate seem suboptimal, since the person or object is almost certainly at a different position in subsequent frames.
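
To make the motivation above concrete, here is a minimal toy sketch in Python/NumPy. It is not the authors' implementation. It assumes a single bright pixel moving one pixel per frame with a known, constant velocity, and all function names (`make_moving_dot`, `temporal_response_fixed`, `temporal_response_along_motion`) are hypothetical. It compares a 3-tap temporal average taken at a fixed (x,y) coordinate with the same filter sampled along the motion path.

```python
import numpy as np


def make_moving_dot(num_frames=5, height=8, width=8, start=(2, 1), velocity=(0, 1)):
    """Toy clip: one bright pixel that moves by `velocity` = (dy, dx) each frame."""
    video = np.zeros((num_frames, height, width), dtype=np.float32)
    for t in range(num_frames):
        y = start[0] + t * velocity[0]
        x = start[1] + t * velocity[1]
        video[t, y, x] = 1.0
    return video


def temporal_response_fixed(video, t, y, x, kernel):
    """3-tap temporal filter sampled at the same (y, x) in frames t-1, t, t+1."""
    return sum(kernel[k] * video[t + k - 1, y, x] for k in range(3))


def temporal_response_along_motion(video, t, y, x, kernel, velocity):
    """Same filter, but the sampling location follows the (assumed known) motion."""
    dy, dx = velocity
    total = 0.0
    for k in range(3):
        offset = k - 1  # -1, 0, +1 frames relative to t
        total += kernel[k] * video[t + offset, y + offset * dy, x + offset * dx]
    return total


video = make_moving_dot()
kernel = np.array([1 / 3, 1 / 3, 1 / 3], dtype=np.float32)

# At frame 2 the dot sits at (y, x) = (2, 3): start (2, 1) plus 2 * (0, 1).
t, y, x = 2, 2, 3
fixed = temporal_response_fixed(video, t, y, x, kernel)
tracked = temporal_response_along_motion(video, t, y, x, kernel, velocity=(0, 1))
print(f"fixed (x,y): {fixed:.2f}   along motion: {tracked:.2f}")
```

In this toy setting the fixed-coordinate filter sees the dot in only one of the three frames, while the motion-following filter sees it in all three. Real video would need an estimated motion field (for example, optical flow) and interpolation at non-integer positions, neither of which is modeled here.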
Oct-7-2024, 16:45:06 GMT