Towards Tight Communication Lower Bounds for Distributed Optimisation

Oct-10-2024, 02:27:15 GMT–Neural Information Processing Systems

We consider a standard distributed optimisation setting where N machines, each holding a d -dimensional function f_i, aim to jointly minimise the sum of the functions \sum_{i 1} N f_i (x) . This problem arises naturally in large-scale distributed optimisation, where a standard solution is to apply variants of (stochastic) gradient descent. We focus on the communication complexity of this problem: our main result provides the first fully unconditional bounds on total number of bits which need to be sent and received by the N machines to solve this problem under point-to-point communication, within a given error-tolerance. Specifically, we show that \Omega( Nd \log d / N\varepsilon) total bits need to be communicated between the machines to find an additive \epsilon -approximation to the minimum of \sum_{i 1} N f_i (x) . The result holds for both deterministic and randomised algorithms, and, importantly, requires no assumptions on the algorithm structure.

communication complexity, optimisation, tight communication lower bound, (2 more...)

Neural Information Processing Systems

Oct-10-2024, 02:27:15 GMT

Conferences Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.67)