Goto

Collaborating Authors

 aa null





Content 14

Neural Information Processing Systems

First we start with the reparametrized Projected Gradient Descent algorithm. The update rule for g follows directly. Suppose that the loss does not converge to zero. Now, in general, to avoid converging to this set, we must make some additional assumptions on the initialization. However, it is also more general.


SGD Learns One-Layer Networks in WGANs

arXiv.org Machine Learning

Generative adversarial networks (GANs) are a widely used framework for learning generative models. Wasserstein GANs (WGANs), one of the most successful variants of GANs, require solving a minmax optimization problem to global optimality, but are in practice successfully trained using stochastic gradient descent-ascent. In this paper, we show that, when the generator is a one-layer network, stochastic gradient descent-ascent converges to a global solution with polynomial time and sample complexity.