On Separate Normalization in Self-supervised Transformers

Yinkai Wang
Department of Computer Science, Tufts University
