Supplementary Material for "Deep Learning with Label Differential Privacy"

A Missing Proofs

A.1 Proof of Lemma 1

Neural Information Processing Systems 

RRTop-k is ε-DP as desired.

The training set contains 60,000 examples and the test set contains 10,000. On MNIST, Fashion MNIST, and KMNIST, we train the models with mini-batch SGD with batch size 265 and momentum 0.9. On CIFAR-10, we use batch size 512 and momentum 0.9, and train for 200 epochs. The learning rate follows the widely used piecewise-constant schedule with linear ramp-up.
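For concreteness, a piecewise-constant schedule with linear ramp-up can be sketched as follows. This is a minimal illustration and not the exact configuration used in the experiments: the function name and the values of `peak_lr`, `rampup_epochs`, `boundaries`, and `decay` are assumptions chosen only for the example, since the text above does not specify them.

```python
def piecewise_constant_with_rampup(
    epoch,
    peak_lr=0.1,            # assumed peak learning rate (not from the paper)
    rampup_epochs=5,        # assumed length of the linear ramp-up
    boundaries=(100, 150),  # assumed epochs at which the rate is decayed
    decay=0.1,              # assumed multiplicative decay at each boundary
):
    """Return the learning rate at a (possibly fractional) epoch."""
    # Linear ramp-up from 0 to peak_lr over the first rampup_epochs epochs.
    if epoch < rampup_epochs:
        return peak_lr * epoch / rampup_epochs
    # Afterwards the rate is constant between boundaries and is multiplied
    # by `decay` once for every boundary that has been passed.
    n_drops = sum(epoch >= b for b in boundaries)
    return peak_lr * decay ** n_drops


if __name__ == "__main__":
    # Print the schedule at a few epochs of a 200-epoch run.
    for e in (0, 2.5, 5, 99, 100, 150, 199):
        print(e, piecewise_constant_with_rampup(e))
```

In a training loop, the returned value would be set as the optimizer's learning rate at the start of each epoch (or each step, if a fractional epoch is passed).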