Figure: Accuracy versus sparsity (0.5 to 0.9) for Block-aware, Original, and Dense. Panel (a): WideResNet22-2; a second panel follows with the same axes and legend.

Aug-19-2025, 21:23:48 GMT–Neural Information Processing Systems

Shuffled-block sparse training effectively reduces the execution time of these layers at different sparsities, achieving overall 1.46x to 5.02x speedups. Figure 1 shows similar speedups for the three models on the CIFAR100 dataset. Figure 14 shows the accuracy of shuffled-block dynamic sparse training with and without our block-aware drop criterion for WideResNet22-2, ResNet18, and VGG16 on the CIFAR10 dataset. Figure 1 shows a similar pattern for the three models on the CIFAR100 dataset.

artificial intelligence, gradient, machine learning
