Neural Information Processing Systems
In this work, we combine these observations to assess whether such trainable, transferable subnetworks exist in pre-trained BERT models. For a range of downstream tasks, we indeed find matching subnetworks at 40% to 90% sparsity.