Long-Short Transformer: Efficient Transformers for Language and Vision (Appendix)

A Details of Norm Comparisons