 sgdm




Does Momentum Change the Implicit Regularization on Separable Data?

Neural Information Processing Systems

The momentum acceleration technique is widely adopted in many optimization algorithms. However, there is no theoretical answer on how momentum affects the generalization performance of these algorithms.



Birder: Communication-Efficient 1-bit Adaptive Optimizer for Practical Distributed DNN Training

Neural Information Processing Systems

Therefore, from a system-level perspective, the design ethos of a system-efficient communication-compression algorithm is that its compression and decompression should be computationally light and fast, and that it should be friendly to efficient collective communication primitives.






Spherical Motion Dynamics

Neural Information Processing Systems

Then the dynamics of w_t is like a physical process, namely a satellite's motion around the earth (see the illustration in Fig. 1): according to