|, which is constant for all t. Define the total disagreement error as φ (z

Neural Information Processing Systems 

The next lemma characterizes the spectral properties of the disagreement matrix, used in Lemma 4. 18 Lemma 7. W is also a stochastic matrix. W are that of I W, each with multiplicity K . Lemma 8. F or every n > 0 we have null null The next Lemma is a well known bound for functions with Lipschitz gradients. The importance is merely technical, and is meant to compress our set of assumption. The MNIST results in Figure 1 used the same settings as above.