Reparameterizing Mirror Descent as Gradient Descent

Neural Information Processing Systems 

For this, we first consider the $\pm$-trick on (18), in which we set $w(t) = w_+(t) - w_-(t)$, where $\frac{d}{dt}\log w_+(t) = -\eta\,\nabla_w L(w(t))$ and $\frac{d}{dt}\log w_-(t) = +\eta\,\nabla_w L(w(t))$.
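As a rough illustration of the ±-trick, the sketch below discretizes the two log-weight flows with a unit-step Euler scheme, so that each step multiplies w_+ by exp(-eta * g) and w_- by exp(+eta * g), where g is the gradient at w = w_+ - w_-. This is only a minimal sketch under stated assumptions: the squared loss, the initialization w_+ = w_- = 1, the learning rate, and all function names are hypothetical choices for the example and are not taken from the paper.

```python
import math

def grad_squared_loss(w, x, y):
    """Gradient of the (assumed) loss L(w) = 0.5 * (w . x - y)^2 w.r.t. w."""
    residual = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [residual * xi for xi in x]

def pm_trick_step(w_pos, w_neg, grad, eta):
    """One unit-step Euler update of log w_+ and log w_- (hypothetical helper)."""
    new_pos = [p * math.exp(-eta * g) for p, g in zip(w_pos, grad)]
    new_neg = [n * math.exp(+eta * g) for n, g in zip(w_neg, grad)]
    return new_pos, new_neg

def run(x, y, eta=0.05, steps=200):
    """Iterate the update from the assumed start w_+ = w_- = 1, i.e. w = 0."""
    w_pos = [1.0] * len(x)
    w_neg = [1.0] * len(x)
    for _ in range(steps):
        # The effective weight is the difference of the two positive parts.
        w = [p - n for p, n in zip(w_pos, w_neg)]
        grad = grad_squared_loss(w, x, y)
        w_pos, w_neg = pm_trick_step(w_pos, w_neg, grad, eta)
    return w_pos, w_neg

if __name__ == "__main__":
    w_pos, w_neg = run([1.0, 2.0], -1.0)
    print([p - n for p, n in zip(w_pos, w_neg)])
```

Because the two multiplicative factors cancel, the coordinate-wise product of w_+ and w_- stays at its initial value throughout, while the difference w_+ - w_- is free to take either sign.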
