TwIST: Rigging the Lottery in Transformers with Independent Subnetwork Training
