Foldable SuperNets: Scalable Merging of Transformers with Different Initializations and Tasks
