On Separate Normalization in Self-supervised Transformers