Reviews: Machine Learning with Adversaries: Byzantine Tolerant Gradient Descent
Neural Information Processing Systems
This paper presents a formal condition for Byzantine fault-tolerant stochastic gradient aggregation, as well as an algorithm (Krum) that satisfies the condition. Given the definition, it is straightforward to see that aggregation based on linear combination (including averaging) is not Byzantine tolerant: a single Byzantine worker can pull the average arbitrarily far by submitting one extreme vector. Meanwhile, the paper gives evidence that Krum is not only tolerant in theory but reasonably so in practice as well. I found the paper to be clear, well-organized, self-contained, and altogether fairly thorough. The basic motivation is also clear: the literature on distributed learning has focused on statistical assumptions and well-behaved actors, but what can we do under very pessimistic conditions?
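For concreteness, here is a minimal sketch of the Krum selection rule as the paper describes it: with n workers of which at most f are Byzantine (requiring n > 2f + 2), each proposed gradient is scored by the sum of squared distances to its n − f − 2 closest peers, and the lowest-scoring gradient is selected. The NumPy code and names below are my own illustration, not the authors' implementation:

```python
import numpy as np

def krum(gradients, f):
    """Select one worker's gradient via the Krum rule.

    gradients: list of n 1-D numpy arrays, one per worker.
    f: assumed number of Byzantine workers; requires n > 2f + 2.
    """
    n = len(gradients)
    assert n > 2 * f + 2, "Krum requires n > 2f + 2"
    # Pairwise squared Euclidean distances between all proposed gradients.
    dists = np.array([[np.sum((g_i - g_j) ** 2) for g_j in gradients]
                      for g_i in gradients])
    scores = []
    for i in range(n):
        # Distances from worker i to every other worker, sorted ascending.
        others = np.sort(np.delete(dists[i], i))
        # Score = summed squared distance to the n - f - 2 closest peers.
        scores.append(others[: n - f - 2].sum())
    # Return the gradient whose neighbourhood is tightest.
    return gradients[int(np.argmin(scores))]

# Hypothetical usage: 7 honest gradients plus 2 extreme Byzantine ones.
rng = np.random.default_rng(0)
grads = [rng.normal(0.0, 0.1, size=4) for _ in range(7)]
grads += [np.full(4, 1e6), np.full(4, -1e6)]  # adversarial proposals
print(krum(grads, f=2))  # stays near an honest gradient; the average would not
```

The contrast with averaging is visible in the usage line: the two extreme vectors would dominate a linear combination, but they land far from every honest gradient's neighbourhood and so are never selected by Krum.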