Reviews: On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport
Neural Information Processing Systems
This paper considers the problem of optimizing over measures rather than directly over parameters (as is standard in ML), for differentiable predictors with a convex loss. This is an infinite-dimensional convex optimization problem. The paper instead optimizes over m particles (Dirac deltas); as m tends to infinity, this corresponds to optimizing over the space of measures. Proposition 2.3 shows existence and uniqueness of the particle gradient flow for a given initialization.
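To make the particle picture concrete, here is a minimal sketch of discretized particle gradient descent. It is an illustration only, not the paper's algorithm: it assumes a two-layer tanh network, a squared loss, and an explicit Euler discretization of the flow with step size `lr`. All function names and the parameter layout are hypothetical. The measure is represented by m particles, and the predictor is the average of the per-particle contributions.

```python
import numpy as np

# Assumption: each particle theta_i = (a_i, w_i) contributes a_i * tanh(w_i . x),
# and the predictor is the average over the m particles (an empirical measure).

def predict(thetas, X):
    a, W = thetas[:, 0], thetas[:, 1:]
    return np.tanh(X @ W.T) @ a / len(thetas)

def loss(thetas, X, y):
    # Squared loss: convex in the predictor, not in the particle positions.
    return 0.5 * np.mean((predict(thetas, X) - y) ** 2)

def particle_gradient_step(thetas, X, y, lr):
    m, n = len(thetas), len(y)
    a, W = thetas[:, 0], thetas[:, 1:]
    H = np.tanh(X @ W.T)                      # (n, m) hidden activations
    r = H @ a / m - y                         # (n,) residuals
    grad_a = H.T @ r / (n * m)                # d loss / d a_i
    grad_W = (r[:, None] * (1.0 - H ** 2) * a[None, :]).T @ X / (n * m)
    grad = np.column_stack([grad_a, grad_W])  # (m, d + 1)
    # Multiply by m so that each particle moves at a speed independent of m.
    return thetas - lr * m * grad

def run(thetas, X, y, lr=0.05, steps=200):
    for _ in range(steps):
        thetas = particle_gradient_step(thetas, X, y, lr)
    return thetas
```

The factor m in the update is a design choice of this sketch: without it, each particle's gradient shrinks like 1/m, so the per-particle dynamics would slow down as the number of particles grows.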
Oct-7-2024, 22:03:19 GMT