Reviews: ATOMO: Communication-efficient Learning via Atomic Sparsification

Oct-7-2024, 08:01:38 GMT–Neural Information Processing Systems

After rebuttal: I do not wish to change my evaluation. Regarding convergence, I think this should be clarified in the paper, at least to ensure that the method does not produce divergent sequences under reasonable assumptions. As for the variance, the authors control the variance of \hat{g} conditional on g, but they should control the variance of \hat{g} without conditioning in order to invoke general convergence results. This is very minor but should be mentioned.

The authors consider the problem of empirical risk minimization using a distributed stochastic gradient descent algorithm.
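To make the variance remark concrete, here is a short sketch of how a conditional bound could be turned into an unconditional one. It assumes that \hat{g} is conditionally unbiased, i.e. E[\hat{g} | g] = g, and that g itself has finite variance; both are assumptions of this sketch rather than statements taken from the paper.

```latex
% Assumption (not from the paper): E[\hat{g} \mid g] = g, so E[\hat{g}] = E[g].
\begin{align*}
\mathbb{E}\,\bigl\|\hat{g} - \mathbb{E}[\hat{g}]\bigr\|^2
  &= \mathbb{E}\,\bigl\|(\hat{g} - g) + (g - \mathbb{E}[g])\bigr\|^2 \\
  &= \mathbb{E}\Bigl[\,\mathbb{E}\bigl[\|\hat{g} - g\|^2 \,\big|\, g\bigr]\Bigr]
     + \mathbb{E}\,\bigl\|g - \mathbb{E}[g]\bigr\|^2 .
\end{align*}
% The cross term vanishes because
% E[(\hat{g} - g)^T (g - E[g])] = E[ E[\hat{g} - g \mid g]^T (g - E[g]) ] = 0.
```

Under these assumptions, the unconditional variance is the expected conditional variance plus the variance of g, so a bound on the conditional variance that holds uniformly in g (or in expectation over g) would be enough to control the quantity that general convergence results require.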

atomic sparsification, communication-efficient learning, convergence

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.60)