Reviews: GradiVeQ: Vector Quantization for Bandwidth-Efficient Gradient Aggregation in Distributed CNN Training
–Neural Information Processing Systems
This describes how a PCA, computed during a periodic full-vector (uncompressed) aggregation phase, can be used to derive a good compression scheme for subsequent gradient exchanges. The authors do a good job arguing that linear correlation among gradient vectors should be a common case in machine learning. On the down side, adopting it requires a fair amount of coding work, because training must still include a periodic "full gradient" phase. The PCA and the way it is approximated are practical heuristics, so I don't expect a convergence proof to be possible without a bit of fine print. I also did not see a discussion of what happens with rare classes.
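To make the idea concrete, here is a minimal sketch of PCA-based gradient compression in the spirit of the paper: a projection is learned during an uncompressed "full gradient" phase, then only low-dimensional coefficients are communicated. All names, shapes, and the synthetic near-low-rank gradients are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Hypothetical sketch of PCA-based gradient vector quantization.
# Shapes and data are assumptions for illustration only.
rng = np.random.default_rng(0)

d, n, k = 8, 1000, 2  # vector length, number of gradient slices, retained components
base = rng.standard_normal((n, k)) @ rng.standard_normal((k, d))
G = base + 0.01 * rng.standard_normal((n, d))  # nearly rank-k gradient slices

# "Full gradient" phase: estimate principal directions from uncompressed gradients.
mu = G.mean(axis=0)
_, _, Vt = np.linalg.svd(G - mu, full_matrices=False)
U = Vt[:k].T  # d x k projection learned during the uncompressed phase

# Compressed phase: transmit k coefficients per slice instead of d raw values.
coeffs = (G - mu) @ U            # n x k, what actually goes over the network
G_hat = coeffs @ U.T + mu        # decompression at the receiver

compression_ratio = d / k
rel_err = np.linalg.norm(G - G_hat) / np.linalg.norm(G)
print(compression_ratio, round(rel_err, 4))
```

When the gradient slices really are close to a low-dimensional subspace, as the authors argue, the relative reconstruction error stays small while bandwidth shrinks by d/k; the periodic full-gradient phase exists precisely to keep the learned subspace current.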
Oct-8-2024, 04:41:19 GMT