[Figure: accuracy vs. sparsity (0.5 to 0.9), comparing Block-aware, Original, and Dense; panel (a) WideResNet22-2.]
–Neural Information Processing Systems
Shuffled-block sparse training effectively reduces the execution time of these layers at different sparsities, achieving overall speedups of 1.46x to 5.02x. Figure 1 shows similar speedups for the three models on the CIFAR100 dataset. Figure 14 shows the accuracy of shuffled-block dynamic sparse training with and without our block-aware drop criterion for WideResNet22-2, ResNet18, and VGG16 on the CIFAR10 dataset. Figure 1 shows a similar pattern for the three models on the CIFAR100 dataset.
Aug-19-2025, 21:23:48 GMT