5aea56eefab60e06f35016478e21aae6-Supplemental-Conference.pdf

Neural Information Processing Systems 

A.2 Derivations for Section 3.1

We begin with a formal derivation of the formulas in Section 3.1. Recall that we consider a function $F(\theta)$ whose parameters can be split into $n$ SI groups: $\theta = (\theta_1, \dots, \theta_n)$. We solve the optimization problem (1) with projected gradient descent (2).

Remark 2. The above formulation allegedly lacks the third (divergent) regime. If, conversely, $\eta > 1 / \sum_{i=1}^{n} \alpha_i$, then at each iteration at least one of the individual ELRs exceeds its convergence threshold: $\eta_i > 1 / \alpha_i$.
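The counting argument behind Remark 2 can be checked numerically. The sketch below is illustrative only and rests on an assumption that this excerpt does not state: that the individual ELR of group i is η divided by the squared norm of that group, and that the squared norms sum to one. The names `exceeding_groups`, `alphas`, and `rho_sq` are hypothetical and are not taken from the paper.

```python
import random


def exceeding_groups(eta, alphas, rho_sq):
    """Return indices i whose individual ELR exceeds the threshold 1 / alpha_i.

    ASSUMPTION (not stated in the excerpt): the individual ELR of group i
    is eta / rho_sq[i], where rho_sq[i] is the squared norm of group i.
    """
    return [
        i
        for i, (alpha, r) in enumerate(zip(alphas, rho_sq))
        if eta / r > 1.0 / alpha
    ]


def claim_holds(eta, alphas, trials=1000, seed=0):
    """Check on random norm splits that at least one group exceeds its threshold."""
    rng = random.Random(seed)
    for _ in range(trials):
        raw = [rng.random() + 1e-9 for _ in alphas]
        total = sum(raw)
        # ASSUMPTION: squared group norms are positive and sum to one.
        rho_sq = [r / total for r in raw]
        if not exceeding_groups(eta, alphas, rho_sq):
            return False
    return True


alphas = [0.5, 1.0, 2.0]          # illustrative alpha_i values
eta_large = 1.1 / sum(alphas)     # just above 1 / sum(alpha_i)
print(claim_holds(eta_large, alphas))
```

Under these assumptions the reasoning is a pigeonhole argument: if every group satisfied η_i ≤ 1/α_i, each squared norm would be at least η α_i, and summing over groups would give 1 ≥ η ∑ α_i, which contradicts η > 1/∑ α_i.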
