A Baseline algorithms

Neural Information Processing Systems 

The following theorem is a more general version of Theorem 5.1.

Assume that Assumptions 1 to 3 hold. [...]

Note that the only difference between Theorem B.1 and Theorem 5.1 lies in [...]

That is, the "oldest" response used to update [...]

By Jensen's inequality and L-smoothness, we have [...]

In order for the paper to be self-contained, we restate the proof here.

The following lemma is slightly modified from Lemma 8 in [18].

By Lemma B.1, we have [...]

Combining Appendix B.3.1 and Appendix B.3.2, we have [...]

B.4 Deriving the convergence bound

In this subsection, we obtain Theorem B.1 based on the descent lemma.
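
For reference, the two standard tools named in this appendix, Jensen's inequality for the squared norm and the descent inequality implied by L-smoothness, are usually stated as below. This is a sketch of the textbook forms only: the symbols $a_i$, $n$, $x$, and $y$ are assumptions introduced here for illustration, and the paper's own statements may use different notation or constants.

```latex
% Jensen's inequality applied to the squared Euclidean norm,
% for arbitrary vectors a_1, ..., a_n (hypothetical notation):
\[
  \Bigl\| \frac{1}{n} \sum_{i=1}^{n} a_i \Bigr\|^2
  \;\le\; \frac{1}{n} \sum_{i=1}^{n} \| a_i \|^2 .
\]

% Descent inequality for a differentiable f whose gradient is
% L-Lipschitz (i.e. f is L-smooth), for any points x and y:
\[
  f(y) \;\le\; f(x) + \langle \nabla f(x),\, y - x \rangle
  + \frac{L}{2} \, \| y - x \|^2 .
\]
```

Taking $x$ and $y$ to be consecutive iterates in the second inequality gives the usual starting point for a descent lemma. The first inequality is typically used to bound the squared norm of an averaged update by the average of the individual squared norms.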
