Adversarial Attacks on Black Box Video Classifiers: Leveraging the Power of Geometric Transformations (Supplementary Material)

Neural Information Processing Systems 

We observe that our method outperforms the baseline methods in a statistically significant way. We consider four state-of-the-art video classification models, representing diverse methodologies of learning from videos, i.e., C3D [1], SlowFast [2], TPN [3] and I3D [4], as the black-box victim models for our adversarial attacks. The C3D model applies 3D convolutions to learn spatio-temporal features from videos. SlowFast uses a two-pathway architecture in which the slow pathway operates at a low frame rate to capture spatial semantics and the fast pathway operates at a high frame rate to capture motion at fine temporal resolution. I3D introduces the Inflated 3D ConvNet, which inflates the 2D filters and pooling kernels of traditional 2D CNNs into 3D.
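The inflation idea behind I3D can be illustrated with a small NumPy sketch. It is not taken from the paper's code: the function names (`inflate_2d_kernel`, `correlate_valid_2d`, `correlate_valid_3d`) are hypothetical, and the single-channel, no-padding setting is a simplifying assumption. The sketch repeats a 2D filter along the time axis and rescales it by the temporal depth, so that a video made of identical frames produces the same response as the 2D filter applied to one frame.

```python
import numpy as np


def inflate_2d_kernel(kernel_2d, time_depth):
    """Inflate a (kh, kw) kernel to (time_depth, kh, kw).

    The 2D weights are repeated along a new time axis and divided by
    time_depth, so the temporal sum of the 3D kernel equals the 2D kernel.
    """
    kernel_2d = np.asarray(kernel_2d, dtype=np.float64)
    repeated = np.repeat(kernel_2d[np.newaxis, :, :], time_depth, axis=0)
    return repeated / time_depth


def correlate_valid_2d(image, kernel):
    """Plain 2D cross-correlation without padding (illustrative, not fast)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def correlate_valid_3d(video, kernel):
    """Plain 3D cross-correlation over (time, height, width) without padding."""
    kt, kh, kw = kernel.shape
    out_t = video.shape[0] - kt + 1
    out_h = video.shape[1] - kh + 1
    out_w = video.shape[2] - kw + 1
    out = np.empty((out_t, out_h, out_w))
    for t in range(out_t):
        for i in range(out_h):
            for j in range(out_w):
                patch = video[t:t + kt, i:i + kh, j:j + kw]
                out[t, i, j] = np.sum(patch * kernel)
    return out


# Hypothetical toy data: one random frame repeated to form a static video.
rng = np.random.default_rng(0)
frame = rng.standard_normal((6, 6))
static_video = np.stack([frame] * 4, axis=0)

kernel_2d = rng.standard_normal((3, 3))
kernel_3d = inflate_2d_kernel(kernel_2d, time_depth=3)

response_2d = correlate_valid_2d(frame, kernel_2d)
response_3d = correlate_valid_3d(static_video, kernel_3d)
```

Under these assumptions, every temporal slice of `response_3d` matches `response_2d` up to floating-point error, which is the property that lets an inflated network start from the behaviour of its 2D counterpart.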
