TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters
