Adaptive Gradient Quantization for Data-Parallel SGD

Neural Information Processing Systems 

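Let $\rho(r) = (r - \ell_{\tau(r)}) / (\ell_{\tau(r)+1} - \ell_{\tau(r)})$ be the relative distance of $r$ to level $\tau(r)+1$, where $\tau(r)$ is the index of the level satisfying $\ell_{\tau(r)} \le r < \ell_{\tau(r)+1}$. We define the random variable $h(r)$ such that $h(r) = \ell_{\tau(r)}$ with probability $1 - \rho(r)$ and $h(r) = \ell_{\tau(r)+1}$ with probability $\rho(r)$.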
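The rounding rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `stochastic_round`, the sorted list `levels`, the variable names `tau` and `rho`, and the optional `rng` argument are all assumptions made for the example. It assumes `r` lies within the range spanned by the levels.

```python
import bisect
import random


def stochastic_round(r, levels, rng=random):
    """Randomly round r to one of its two neighbouring quantization levels.

    `levels` is assumed to be a sorted list with at least two entries,
    and r is assumed to satisfy levels[0] <= r <= levels[-1].
    """
    if not levels[0] <= r <= levels[-1]:
        raise ValueError("r must lie within the range of levels")
    # tau: index with levels[tau] <= r < levels[tau + 1].
    # Clamp so that r == levels[-1] falls into the last interval.
    tau = min(bisect.bisect_right(levels, r) - 1, len(levels) - 2)
    lo, hi = levels[tau], levels[tau + 1]
    # rho: relative distance of r to the upper level, in [0, 1].
    rho = (r - lo) / (hi - lo)
    # Upper level with probability rho, lower level with probability 1 - rho.
    return hi if rng.random() < rho else lo
```

Because the upper level is chosen with probability equal to the relative distance, the expected value of the result is $(1 - \rho)\,\ell_{\tau} + \rho\,\ell_{\tau+1} = r$, so the rounding is unbiased. For instance, with `levels = [0.0, 0.25, 0.5, 1.0]` and `r = 0.4`, the function returns either 0.25 or 0.5, choosing 0.5 with probability 0.6.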
