Supplementary material for Space-time Mixing Attention for Video Transformer

Neural Information Processing Systems

The results in Table 2 show that the two approaches differ.