

649adc59afdef2a8b9e943f94a04b02f-Paper.pdf

Neural Information Processing Systems

But these methods are unable to improve throughput (frames per second) on real-world hardware while simultaneously preserving robustness to adversarial perturbations.
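The throughput metric the snippet refers to is straightforward to measure. Below is a minimal sketch timing ResNet-50 inference in frames per second; the model choice, batch size, and iteration counts are illustrative assumptions, not the setup from the paper.

```python
# Minimal sketch: measure inference throughput (frames per second) of a
# ResNet-50 on whatever hardware is available. Model, batch size, and
# iteration counts are illustrative choices, not the paper's setup.
import time

import torch
from torchvision.models import resnet50

device = "cuda" if torch.cuda.is_available() else "cpu"
model = resnet50().to(device).eval()
batch = torch.randn(32, 3, 224, 224, device=device)

with torch.no_grad():
    # Warm-up iterations so one-time initialization does not skew the timing.
    for _ in range(5):
        model(batch)
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    iters = 20
    for _ in range(iters):
        model(batch)
    if device == "cuda":
        torch.cuda.synchronize()
    elapsed = time.perf_counter() - start

print(f"throughput: {iters * batch.size(0) / elapsed:.1f} frames/s")
```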




Dynamic Sparsity Is Channel-Level Sparsity Learner (Lu Yin, Gen Li, et al.)

Neural Information Processing Systems

Sparse training has attracted surging interest in machine learning because of its potential to cut costs across the entire training process as well as at inference.
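As a concrete illustration of the distinction the title draws, the sketch below measures how much channel-level sparsity is present inside an unstructured sparse mask. The random mask, tensor shape, and sparsity level are assumptions made for illustration; a mask learned by dynamic sparse training would be the real object of study.

```python
# Minimal sketch: quantify the channel-level sparsity hiding inside an
# unstructured sparse mask. The random mask below is an illustrative
# stand-in for a mask learned by dynamic sparse training.
import torch

weight = torch.randn(64, 32, 3, 3)               # conv weight: (out_ch, in_ch, kH, kW)
mask = (torch.rand_like(weight) > 0.99).float()  # ~99% unstructured sparsity
sparse_weight = weight * mask

# Element-level sparsity: fraction of individual weights that are zero.
element_sparsity = (sparse_weight == 0).float().mean().item()

# Channel-level sparsity: fraction of output channels that are entirely zero
# and could therefore be pruned for real speedups on dense hardware.
per_channel_mass = sparse_weight.abs().sum(dim=(1, 2, 3))
channel_sparsity = (per_channel_mass == 0).float().mean().item()

print(f"element-level sparsity: {element_sparsity:.2%}")
print(f"channel-level sparsity: {channel_sparsity:.2%}")
```

A random mask like this one yields little channel-level sparsity; the title's claim is that masks learned by dynamic sparse training concentrate zeros into whole channels far more strongly.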




Birder: Communication-Efficient 1-bit Adaptive Optimizer for Practical Distributed DNN Training

Neural Information Processing Systems

Therefore, from a system-level perspective, a system-efficient communication-compression algorithm should ensure that compression and decompression are computationally lightweight and fast, and that the compressed representation remains compatible with efficient collective communication primitives.
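To make that design ethos concrete, here is a minimal sketch of a generic 1-bit (sign) compressor with error feedback. It is not Birder's actual algorithm, only an illustration of why sign-based schemes are computationally light and produce compact, fixed-size payloads that suit collective communication primitives.

```python
# Minimal sketch of 1-bit (sign) gradient compression with error feedback,
# the general style of compressor the sentence describes. This is NOT
# Birder's actual algorithm, just an illustration of the design ethos.
import torch

def compress(grad: torch.Tensor, error: torch.Tensor):
    """Quantize grad + carried-over error to {-1, +1} times one scalar scale."""
    corrected = grad + error
    scale = corrected.abs().mean()          # single scalar per tensor
    signs = torch.sign(corrected)
    new_error = corrected - scale * signs   # error feedback for the next step
    # One bit per element keeps payloads small and all-reduce friendly.
    bits = (signs > 0).to(torch.uint8)
    return bits, scale, new_error

def decompress(bits: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return (bits.float() * 2 - 1) * scale

grad = torch.randn(1024)
error = torch.zeros_like(grad)
bits, scale, error = compress(grad, error)
approx = decompress(bits, scale)
print("compression error:", (grad - approx).norm().item())
```

Both directions are a handful of elementwise operations, which is what "computationally light" means in practice, and the uint8 payload can be exchanged with standard collectives.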



Appendix: KAKURENBO: Adaptively Hiding Samples in Deep Neural Network Training (Appendix A: Proof of Lemma 1)

Neural Information Processing Systems

Table 1 summarizes the models and datasets used in this work. ImageNet-1K Deng et al. (2009): a subset of the ImageNet dataset containing 1,000 classes. DeepCAM Kurth et al. (2018): a dataset for climate-image segmentation. Fractal-3K Kataoka et al. (2022): a rendered dataset generated with the Visual Atom method; we also follow the training setting of Kataoka et al. (2022). Table 2 details our hyper-parameters. Specifically, we follow the TorchVision guidelines to train ResNet-50 with a cosine learning-rate schedule (CosineLR). To show the robustness of KAKURENBO, we also train ResNet-50 under different settings; for example, in the ResNet-50 (A) setting we follow the hyper-parameters reported in Goyal et al. (2017). It is worth noting that KAKURENBO merely hides samples before the input pipeline. In this section, we present an analysis of the factors affecting KAKURENBO's performance. The results show that our method dynamically hides samples at each epoch.
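Since "hiding samples before the input pipeline" is the mechanism the appendix emphasizes, a minimal sketch of that idea follows: each epoch, only a subset of dataset indices is handed to the DataLoader, so hidden samples are never even loaded. The loss-based ranking and the hide fraction below are illustrative assumptions, not necessarily the paper's exact criterion.

```python
# Minimal sketch of "hiding samples before the input pipeline": each epoch,
# the DataLoader only sees a subset of indices, so hidden samples cost
# nothing to load or train on. Ranking by per-sample loss is an illustrative
# importance criterion, not necessarily the paper's exact rule.
import torch
from torch.utils.data import DataLoader, SubsetRandomSampler, TensorDataset

dataset = TensorDataset(torch.randn(1000, 16), torch.randint(0, 10, (1000,)))
per_sample_loss = torch.rand(len(dataset))  # would be tracked during training
hide_fraction = 0.3

# Keep the highest-loss (most informative) samples; hide the rest this epoch.
n_keep = int(len(dataset) * (1 - hide_fraction))
kept_indices = torch.argsort(per_sample_loss, descending=True)[:n_keep]

loader = DataLoader(dataset, batch_size=64,
                    sampler=SubsetRandomSampler(kept_indices.tolist()))
for inputs, targets in loader:
    pass  # normal training step, run only on the visible subset
print(f"training on {n_keep}/{len(dataset)} samples this epoch")
```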


KAKURENBO: Adaptively Hiding Samples in Deep Neural Network Training

Neural Information Processing Systems

This paper proposes a method for hiding the least-important samples during the training of deep neural networks to increase efficiency, i.e., to reduce the cost of training.