When to Use Efficient Self Attention? Profiling Text, Speech and Image Transformer Variants
