Election Coding for Distributed Learning: Protecting SignSGD against Byzantine Attacks (Supplementary Materials)

Neural Information Processing Systems 

Table A.1 summarizes the required training time to achieve the target test accuracy, for various

A.3 Performances for extreme Byzantine attack scenario

In Fig. A.2, we compare the performances of E

Table A.1: Training time (minutes) to reach the test accuracy for the suggested E

We ran experiments for a larger network, ResNet-50, as seen in Fig. A.4a.

Bernoulli codes are abbreviated as "Bern.

Figure A.3: Impact of the reduced effective redundancy

We compared our scheme with full gradient + median (FGM).

Experiments for the CIFAR-10 dataset on ResNet-18 use the hyperparameters summarized in Table B.2. For the experiments on ResNet-50, the batch size is set to B = 64.

We define the notation used for proving the main mathematical results.
