A Framework Combining 3D CNN and Transformer for Video-Based Behavior Recognition
