Contents of the Appendix

Neural Information Processing Systems 

The structure of this section is as follows: Appendix A.1 describes the notations used in the proof; Appendix A.2 introduces the properties of mixing matrix We use upper case, bold letters for matrices and lower case, bold letters for vectors. The algebraic multiplicity of eigenvalue 1 of W is 1. Thus the algebraic multiplicity of 1 is 1.Theorem II (Perron-Frobenius Theorem for W). The mixing W of RelaySGD satisfies 1. (Positivity) ρ (W) = 1 is an eigenvalue of W . 2. (Simplicity) The algebraic multiplicity of 1 is 1. 3. (Dominance) ρ( W) = | λ Statements 1 and 4 follow from Lemma 4. Statement 2 follows from Lemma 6. Statement 3 follows from Lemma 5 and Lemma 6.Lemma 7 (Gelfand's formula) . We characterize the convergence rate of the consensus distance in the following key lemma: Lemma' 1 Then, we apply Gelfand's formula (Lemma 7) with Lemma 8. Given I in Definition G, we have the following estimate null1 π This assumption is used in the proof of Proposition III. The complete proofs for each case are then given in the following Appendix A.4, The next lemma explains their relations.