A Appendix: Proof of Theorem 1

Neural Information Processing Systems 

We first show that the estimate is unbiased. Next, we analyze the variance of the multibatch estimate. For q = 3: Let us consider the case when i = s and j ≠ t; the derivation for the case when i ≠ s and j = t is analogous.