Towards Data-Agnostic Pruning At Initialization: What Makes a Good Sparse Mask?
