Escaping Saddle Points with Compressed SGD
Neural Information Processing Systems
Stochastic gradient descent (SGD) is a prevalent optimization technique for large-scale distributed machine learning. While SGD computation can be efficiently divided between multiple machines, communication typically becomes a bottleneck in the distributed setting.
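To make the communication bottleneck concrete, below is a minimal sketch of one step of distributed SGD with gradient compression. It assumes a top-k sparsification compressor and hypothetical names (`topk_compress`, `compressed_sgd_step`, `num_workers`); the paper's actual compression scheme and analysis may differ, so this is only an illustration of the general idea that workers send a small sparse message instead of the full gradient.

```python
import numpy as np

def topk_compress(grad, k):
    """Keep only the k largest-magnitude entries of the gradient.
    Returns (indices, values), the sparse message a worker would send."""
    idx = np.argpartition(np.abs(grad), -k)[-k:]
    return idx, grad[idx]

def decompress(idx, vals, dim):
    """Rebuild a dense gradient vector from a sparse (indices, values) message."""
    g = np.zeros(dim)
    g[idx] = vals
    return g

def compressed_sgd_step(params, worker_grads, lr=0.1, k=10):
    """One illustrative step: each worker compresses its stochastic gradient,
    the server averages the decompressed messages and updates the parameters."""
    dim = params.size
    messages = [topk_compress(g, k) for g in worker_grads]
    avg = np.mean([decompress(i, v, dim) for i, v in messages], axis=0)
    return params - lr * avg

# Example usage with synthetic gradients from 4 workers (assumed setup):
dim, num_workers = 100, 4
params = np.random.randn(dim)
worker_grads = [np.random.randn(dim) for _ in range(num_workers)]
params = compressed_sgd_step(params, worker_grads, lr=0.1, k=10)
```

With top-k and k much smaller than the model dimension, each worker transmits only k index-value pairs per step rather than the full gradient, which is the communication saving the compressed-SGD setting targets.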