Goto

Collaborating Authors

 convnet


Spatiotemporal Residual Networks for Video Action Recognition

Neural Information Processing Systems

Two-stream Convolutional Networks (ConvNets) have shown strong performance for human action recognition in videos. Recently, Residual Networks (ResNets) have arisen as a new technique to train extremely deep architectures. In this paper, we introduce spatiotemporal ResNets as a combination of these two approaches.



Appendix 367 A Implementation Details

Neural Information Processing Systems

W e are also committed to releasing the code. Implementation details for Stage 2. Our implementation strictly follows the previous work that also In this section, we briefly introduce our tasks. It requires the robot hand to open the door on the table. It requires the robot hand to orient the pen to the target orientation. It requires the robot hand to place the object on the table into the mug. We present the success rates of our six task categories as in Table 1.


Deep Neural Networks with Box Convolutions

Egor Burkov, Victor Lempitsky

Neural Information Processing Systems

Due to its ability to integrate information over large boxes, the new layer facilitates long-range propagation of information and leads to the efficient increase ofthe receptivefields ofnetwork units.