b6af2c9703f203a2794be03d443af2e3-Paper.pdf

Neural Information Processing Systems 

In this work, we combine these observations to assess whether such trainable, transferrable subnetworks exist in pre-trained BERT models. For a range of downstream tasks, we indeed find matching subnetworks at 40% to 90% sparsity.

Similar Docs  Excel Report  more

TitleSimilaritySource
None found