Reviews: The Sound of APALM Clapping: Faster Nonsmooth Nonconvex Optimization with Stochastic Asynchronous PALM
–Neural Information Processing Systems
I find the approach rather interesting, especially the broad and general definition of the problem makes the approach applicable to a wide range of problems. However, I was surprised by the absence of any reference to the seminal Robbins/Munro paper and also to the recent developments in stochastic gradient descent based sampling (see below). The authors do local gradient descent updates of coordinate blocks by computing partial gradients and adding noise in each asynchronous step. I was wondering, how this relates to the "usual" stochastic gradient descent update, i.e., given that the locally computed partial gradient will be based on delayed (noisy) variable states, a sequence of these noisy partial gradients would converge to the true partial gradient as well. Further, recent SGD based sampling has shown that adding noise to the variable states obtained by noisy gradient updates (as the authors do as well) provides good samples of the distribution underlying the optimal variable setting also in a non-convex setting. That being said, the work handed in remains valid, but it would have been interesting to compare the proposed approach to well established stochastic gradient methods.
Neural Information Processing Systems
Jan-20-2025, 08:34:28 GMT
- Technology: