Enabling Detailed Action Recognition Evaluation Through Video Dataset Augmentation

Neural Information Processing Systems 

It is well known in the video understanding community that human action recognition models suffer from background bias, i.e., over-relying on scene cues in making