Supplementary File for "Stochastic Gradient Descent in Correlated Settings: A Study on Gaussian Processes"
Neural Information Processing Systems
The supplementary file is organized as follows: Section 1 restates the assumptions and the main theorems on the convergence of the parameter iterates and of the full gradient; Section 2 is devoted to the proofs of the two main theorems; Section 3 contains the proofs of the supporting lemmas; and Section 4 presents additional figures from the numerical study.

Under Assumptions 1.1 to 1.3, when $m > C$ for some constant $C > 0$, we have the following results under two corresponding conditions on $s$.

First, we present the following lemma, showing that the loss function has a property similar to strong convexity.

For the first case discussed in Lemma 2.1, define the auxiliary quantity $g(\theta^{(k-1)})$ at the iterate $\theta^{(k-1)}$.

Therefore, combining Lemma 2.1, Lemma 2.2, and (7) leads to the following conclusion.

Proof of Theorem 2. We start by bounding the norm of the full gradient. In this case, we can still apply (15) in Lemma 2.3. The proof of this claim closely follows that of Lemma 5.2 in [2].
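The theorems above concern minibatch SGD applied to the Gaussian-process log-likelihood, where each step draws a subsample of size $m$ and updates the hyperparameter iterate $\theta^{(k-1)}$. A minimal sketch of that update is given below, assuming an RBF kernel with a single lengthscale; the synthetic data, step size `eta`, batch size `m`, and projection threshold are illustrative placeholders, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1-D data from a smooth function (assumed setup for illustration).
n = 200
x = rng.uniform(0.0, 5.0, n)
y = np.sin(x) + 0.1 * rng.standard_normal(n)

noise = 0.1 ** 2  # fixed noise variance for this sketch

def nll_grad(ell, xb, yb):
    """Gradient of the minibatch negative log marginal likelihood with
    respect to the RBF lengthscale ell, via the standard trace identity:
    dL/d\theta = 0.5 tr(K^{-1} dK) - 0.5 y^T K^{-1} dK K^{-1} y."""
    d2 = (xb[:, None] - xb[None, :]) ** 2
    K = np.exp(-d2 / (2.0 * ell ** 2)) + noise * np.eye(len(xb))
    dK = np.exp(-d2 / (2.0 * ell ** 2)) * d2 / ell ** 3
    Kinv_y = np.linalg.solve(K, yb)       # K^{-1} y
    Kinv_dK = np.linalg.solve(K, dK)      # K^{-1} dK
    return 0.5 * np.trace(Kinv_dK) - 0.5 * Kinv_y @ dK @ Kinv_y

# Minibatch SGD on the lengthscale: at step k, draw a subsample of size m
# and take a gradient step on that minibatch's likelihood.
ell, m, eta = 3.0, 32, 0.05
for k in range(300):
    idx = rng.choice(n, size=m, replace=False)
    ell -= eta * nll_grad(ell, x[idx], y[idx]) / m  # per-sample scaling
    ell = max(ell, 1e-2)  # keep the iterate in a feasible region

print(f"estimated lengthscale: {ell:.3f}")
```

Each iteration inverts only an $m \times m$ kernel submatrix rather than the full $n \times n$ matrix, which is the computational motivation for the subsampled scheme the theorems analyze.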