AITopics | byzantine machine

Collaborating Authors

byzantine machine

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Reviews: Byzantine Stochastic Gradient Descent

Neural Information Processing SystemsJan-20-2025, 04:25:27 GMT

The paper studies stochastic convex optimization in a distributed master/workers framework, where on each round each machine out of m produces a stochastic gradient and sends it to the master, which aggregates these into a mini-batch. In this paper the authors allow a fraction of alpha of the machines to be Byzantine, i.e., they do not need to report valid stochastic gradients but may produce arbitrary vectors, even in an adversarial manner. The goal is to aggregate the reports of the machines and to converge to an optimal solution of the convex objective despite the malicious Byzantine machines. The authors present a novel variant of minibatch-SGD which tackles the difficulty the dealing with Byzantine machines. They prove upper-bounds on the convergence and nearly optimal matching lower-bounds on any algorithm working in such framework, and in this sense the results are quite satisfactory.

byzantine machine, byzantine stochastic gradient descent, current paper, (3 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.85)

Add feedback

Byzantine-Robust Clustered Federated Learning

Tao, Zhixu, Yang, Kun, Kulkarni, Sanjeev R.

arXiv.org Artificial IntelligenceJun-1-2023

This paper focuses on the problem of adversarial attacks from Byzantine machines in a Federated Learning setting where non-Byzantine machines can be partitioned into disjoint clusters. In this setting, non-Byzantine machines in the same cluster have the same underlying data distribution, and different clusters of non-Byzantine machines have different learning tasks. Byzantine machines can adversarially attack any cluster and disturb the training process on clusters they attack. In the presence of Byzantine machines, the goal of our work is to identify cluster membership of non-Byzantine machines and optimize the models learned by each cluster. We adopt the Iterative Federated Clustering Algorithm (IFCA) framework of Ghosh et al. (2020) to alternatively estimate cluster membership and optimize models. In order to make this framework robust against adversarial attacks from Byzantine machines, we use coordinate-wise trimmed mean and coordinate-wise median aggregation methods used by Yin et al. (2018). Specifically, we propose a new Byzantine-Robust Iterative Federated Clustering Algorithm to improve on the results in Ghosh et al. (2019). We prove a convergence rate for this algorithm for strongly convex loss functions. We compare our convergence rate with the convergence rate of an existing algorithm, and we demonstrate the performance of our algorithm on simulated data.

artificial intelligence, byzantine machine, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2306.00638

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > Virginia (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback

Variance Reduced Median-of-Means Estimator for Byzantine-Robust Distributed Inference

Tu, Jiyuan, Liu, Weidong, Mao, Xiaojun, Chen, Xi

arXiv.org Machine LearningMar-4-2021

This paper develops an efficient distributed inference algorithm, which is robust against a moderate fraction of Byzantine nodes, namely arbitrary and possibly adversarial machines in a distributed learning system. In robust statistics, the median-of-means (MOM) has been a popular approach to hedge against Byzantine failures due to its ease of implementation and computational efficiency. However, the MOM estimator has the shortcoming in terms of statistical efficiency. The first main contribution of the paper is to propose a variance reduced median-of-means (VRMOM) estimator, which improves the statistical efficiency over the vanilla MOM estimator and is computationally as efficient as the MOM. Based on the proposed VRMOM estimator, we develop a general distributed inference algorithm that is robust against Byzantine failures. Theoretically, our distributed algorithm achieves a fast convergence rate with only a constant number of rounds of communications. We also provide the asymptotic normality result for the purpose of statistical inference. To the best of our knowledge, this is the first normality result in the setting of Byzantine-robust distributed learning. The simulation results are also presented to illustrate the effectiveness of our method.

byzantine machine, estimator, vrmom estimator, (13 more...)

arXiv.org Machine Learning

2103.0286

Country:

Asia > Middle East > Jordan (0.04)
Asia > China > Shanghai > Shanghai (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Communication-efficient Byzantine-robust distributed learning with statistical guarantee

Zhou, Xingcai, Chang, Le, Xu, Pengfei, Lv, Shaogao

arXiv.org Machine LearningFeb-27-2021

Communication efficiency and robustness are two major issues in modern distributed learning framework. This is due to the practical situations where some computing nodes may have limited communication power or may behave adversarial behaviors. To address the two issues simultaneously, this paper develops two communication-efficient and robust distributed learning algorithms for convex problems. Our motivation is based on surrogate likelihood framework and the median and trimmed mean operations. Particularly, the proposed algorithms are provably robust against Byzantine failures, and also achieve optimal statistical rates for strong convex losses and convex (non-smooth) penalties. For typical statistical models such as generalized linear models, our results show that statistical errors dominate optimization errors in finite iterations. Simulated and real data experiments are conducted to demonstrate the numerical performance of our algorithms.

algorithm, theorem 3, worker machine, (16 more...)

arXiv.org Machine Learning

2103.00373

Country:

Asia > Middle East > Jordan (0.05)
Asia > China > Jiangsu Province > Nanjing (0.04)
North America > United States > New York (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Communication-Efficient and Byzantine-Robust Distributed Learning

Ghosh, Avishek, Maity, Raj Kumar, Kadhe, Swanand, Mazumdar, Arya, Ramchandran, Kannan

arXiv.org Machine LearningNov-21-2019

We develop a communication-efficient distributed learning algorithm that is robust against Byzantine worker machines. We propose and analyze a distributed gradient-descent algorithm that performs a simple thresholding based on gradient norms to mitigate Byzantine failures. We show the (statistical) error-rate of our algorithm matches that of [YCKB18], which uses more complicated schemes (like coordinate-wise median or trimmed mean) and thus optimal. Furthermore, for communication efficiency, we consider a generic class of {\delta}-approximate compressors from [KRSJ19] that encompasses sign-based compressors and top-k sparsification. Our algorithm uses compressed gradients and gradient norms for aggregation and Byzantine removal respectively. We establish the statistical error rate of the algorithm for arbitrary (convex or non-convex) smooth loss function. We show that, in the regime when the compression factor {\delta} is constant and the dimension of the parameter space is fixed, the rate of convergence is not affected by the compression operation, and hence we effectively get the compression for free. Moreover, we extend the compressed gradient descent algorithm with error feedback proposed in [KRSJ19] for the distributed setting. We have experimentally validated our results and shown good performance in convergence for convex (least-square regression) and non-convex (neural network training) problems.

algorithm, compressor, gradient, (15 more...)

arXiv.org Machine Learning

1911.09721

Country:

North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Robust Federated Learning in a Heterogeneous Environment

Ghosh, Avishek, Hong, Justin, Yin, Dong, Ramchandran, Kannan

arXiv.org Machine LearningJun-15-2019

We study a recently proposed large-scale distributed learning paradigm, namely Federated Learning, where the worker machines are end users' own devices. Statistical and computational challenges arise in Federated Learning particularly in the presence of heterogeneous data distribution (i.e., data points on different devices belong to different distributions signifying different clusters) and Byzantine machines (i.e., machines that may behave abnormally, or even exhibit arbitrary and potentially adversarial behavior). To address the aforementioned challenges, first we propose a general statistical model for this problem which takes both the cluster structure of the users and the Byzantine machines into account. Then, leveraging the statistical model, we solve the robust heterogeneous Federated Learning problem \emph{optimally}; in particular our algorithm matches the lower bound on the estimation error in dimension and the number of data points. Furthermore, as a by-product, we prove statistical guarantees for an outlier-robust clustering algorithm, which can be considered as the Lloyd algorithm with robust estimation. Finally, we show via synthetic as well as real data experiments that the estimation error obtained by our proposed algorithm is significantly better than the non-Byzantine-robust algorithms; in particular, we gain at least by 53\% and 33\% for synthetic and real data experiments, respectively, in typical settings.

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Machine Learning

1906.06629

Genre: Research Report (0.50)

Industry: Education (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Byzantine Stochastic Gradient Descent

Alistarh, Dan, Allen-Zhu, Zeyuan, Li, Jerry

Neural Information Processing SystemsDec-31-2018

This paper studies the problem of distributed stochastic optimization in an adversarial setting where, out of $m$ machines which allegedly compute stochastic gradients every iteration, an $\alpha$-fraction are Byzantine, and may behave adversarially. Our main result is a variant of stochastic gradient descent (SGD) which finds $\varepsilon$-approximate minimizers of convex functions in $T = \tilde{O}\big( \frac{1}{\varepsilon^2 m} + \frac{\alpha^2}{\varepsilon^2} \big)$ iterations. In contrast, traditional mini-batch SGD needs $T = O\big( \frac{1}{\varepsilon^2 m} \big)$ iterations, but cannot tolerate Byzantine failures. Further, we provide a lower bound showing that, up to logarithmic factors, our algorithm is information-theoretically optimal both in terms of sample complexity and time complexity.

algorithm, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America (0.46)

Genre: Research Report (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Add feedback

Byzantine Stochastic Gradient Descent

Alistarh, Dan, Allen-Zhu, Zeyuan, Li, Jerry

Neural Information Processing SystemsDec-31-2018

algorithm, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America (0.46)

Genre: Research Report (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Add feedback

Defending Against Saddle Point Attack in Byzantine-Robust Distributed Learning

Yin, Dong, Chen, Yudong, Ramchandran, Kannan, Bartlett, Peter

arXiv.org Machine LearningJun-14-2018

In this paper, we study robust large-scale distributed learning in the presence of saddle points in non-convex loss functions. We consider the Byzantine setting where some worker machines may have abnormal or even arbitrary and adversarial behavior. We argue that in the Byzantine setting, optimizing a non-convex function and escaping saddle points become much more challenging, even when robust gradient estimators are used. We develop ByzantinePGD, a robust and communication-efficient algorithm that can provably escape saddle points and converge to approximate local minimizers. The iteration complexity of our algorithm in the Byzantine setting matches that of standard gradient descent in the usual setting. We further provide three robust aggregation subroutines that can be used in ByzantinePGD, including median, trimmed mean, and iterative filtering. We characterize their performance in statistical settings, and argue for their near-optimality in different regimes including the high dimensional setting.

algorithm, artificial intelligence, machine learning, (15 more...)

arXiv.org Machine Learning

1806.05358

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.50)

Add feedback

Filters

Collaborating Authors

byzantine machine

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

a07c2f3b3b907aaf8436a26c6d77f0a2-Paper.pdf

Reviews: Byzantine Stochastic Gradient Descent

Byzantine-Robust Clustered Federated Learning

Variance Reduced Median-of-Means Estimator for Byzantine-Robust Distributed Inference

Communication-efficient Byzantine-robust distributed learning with statistical guarantee

Communication-Efficient and Byzantine-Robust Distributed Learning

Robust Federated Learning in a Heterogeneous Environment

Byzantine Stochastic Gradient Descent

Byzantine Stochastic Gradient Descent

Defending Against Saddle Point Attack in Byzantine-Robust Distributed Learning