Reviews: Byzantine Stochastic Gradient Descent

Jan-20-2025, 04:25:27 GMT – Neural Information Processing Systems

The paper studies stochastic convex optimization in a distributed master/workers framework, where in each round each of the m machines produces a stochastic gradient and sends it to the master, which aggregates these into a mini-batch. The authors allow an alpha fraction of the machines to be Byzantine, i.e., they need not report valid stochastic gradients but may produce arbitrary vectors, even adversarially. The goal is to aggregate the reports of the machines and converge to an optimal solution of the convex objective despite the malicious Byzantine machines. The authors present a novel variant of minibatch-SGD that tackles the difficulty of dealing with Byzantine machines. They prove upper bounds on the convergence rate and nearly matching lower bounds for any algorithm working in this framework, and in this sense the results are quite satisfactory.
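To make the setting concrete, here is a minimal sketch of the master/workers loop with an alpha fraction of Byzantine machines. It does not reproduce the authors' algorithm, which the review does not spell out. It swaps in a coordinate-wise median, a generic robust aggregator, purely to contrast with plain averaging. The quadratic objective, the fixed-vector adversary, and all function names and parameters are assumptions made for illustration.

```python
import random
from statistics import median

def stochastic_gradient(x, target, rng, noise=0.1):
    # Honest worker: gradient of 0.5 * ||x - target||^2 plus Gaussian noise.
    return [xi - ti + rng.gauss(0.0, noise) for xi, ti in zip(x, target)]

def byzantine_report(dim, magnitude=100.0):
    # Hypothetical adversary: always reports the same large vector.
    return [magnitude] * dim

def aggregate_mean(reports):
    # Plain mini-batch averaging, as in standard minibatch-SGD.
    return [sum(col) / len(col) for col in zip(*reports)]

def aggregate_median(reports):
    # Coordinate-wise median: a stand-in robust aggregator (not the paper's).
    return [median(col) for col in zip(*reports)]

def run_sgd(aggregate, m=20, alpha=0.25, steps=200, lr=0.1, seed=0):
    rng = random.Random(seed)
    target = [1.0, -2.0, 3.0]   # minimizer of the toy objective
    x = [0.0] * len(target)
    n_byz = int(alpha * m)      # number of Byzantine machines
    for _ in range(steps):
        reports = [byzantine_report(len(x)) for _ in range(n_byz)]
        reports += [stochastic_gradient(x, target, rng) for _ in range(m - n_byz)]
        g = aggregate(reports)  # master combines the m reports
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x, target

def distance(x, target):
    # Euclidean distance between the iterate and the minimizer.
    return sum((a - b) ** 2 for a, b in zip(x, target)) ** 0.5

if __name__ == "__main__":
    for name, agg in [("mean", aggregate_mean), ("median", aggregate_median)]:
        x, target = run_sgd(agg)
        print(name, "distance to optimum:", round(distance(x, target), 3))
```

In this toy run, averaging is dragged far from the minimizer by the adversarial reports, while the median-based iterate stays close to it. This illustrates only why naive aggregation fails; it says nothing about the rates proved in the paper.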



Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Mathematical & Statistical Methods (0.85)
  - Machine Learning > Statistical Learning
    - Gradient Descent (1.00)