AITopics

2509.09534

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Egger, Maximilian, Bitar, Rawad

Private Aggregation for Byzantine-Resilient Heterogeneous Federated Learning

arXiv.org Machine Learning, Jun-12-2025

Ensuring resilience to Byzantine clients while maintaining the privacy of the clients' data is a fundamental challenge in federated learning (FL). When the clients' data is homogeneous, suitable countermeasures were studied from an information-theoretic perspective utilizing secure aggregation techniques while ensuring robust aggregation of the clients' gradients. However, the countermeasures used fail when the clients' data is heterogeneous. Suitable pre-processing techniques, such as nearest neighbor mixing, were recently shown to enhance the performance of those countermeasures in the heterogeneous setting. Nevertheless, those pre-processing techniques cannot be applied with the introduced privacy-preserving mechanisms. We propose a multi-stage method encompassing a careful co-design of verifiable secret sharing, secure aggregation, and a tailored symmetric private information retrieval scheme to achieve information-theoretic privacy guarantees and Byzantine resilience under data heterogeneity. We evaluate the effectiveness of our scheme on a variety of attacks and show how it outperforms the previously known techniques. Since the communication overhead of secure aggregation is non-negligible, we investigate the interplay with zero-order estimation methods that reduce the communication cost in state-of-the-art FL tasks and thereby make private aggregation scalable.
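The nearest neighbor mixing step mentioned in the abstract can be illustrated in the clear, without any of the privacy machinery. The following is a minimal sketch, assuming the usual description of the technique (each client vector is replaced by the average of its n - f nearest vectors, itself included) and assuming a coordinate-wise median as the robust aggregator. It is not the paper's private protocol: no secret sharing, secure aggregation, or private information retrieval is involved, and the function names are hypothetical.

```python
from statistics import median


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_neighbor_mixing(vectors, f):
    """Replace each vector by the mean of its n - f nearest vectors (itself included).

    f is the assumed upper bound on the number of Byzantine clients.
    """
    k = len(vectors) - f
    mixed = []
    for v in vectors:
        nearest = sorted(vectors, key=lambda u: sq_dist(v, u))[:k]
        mixed.append([sum(coords) / k for coords in zip(*nearest)])
    return mixed


def coordinate_wise_median(vectors):
    """A simple robust aggregator, applied here after mixing (an assumption)."""
    return [median(coords) for coords in zip(*vectors)]
```

For example, with three honest gradients near (1, 1) and one outlier at (100, -100), calling `coordinate_wise_median(nearest_neighbor_mixing(vectors, f=1))` returns a vector close to (1, 1), because the mixing step pulls the honest vectors together before the aggregator is applied.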

aggregation, machine learning, natural language

2506.0987

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial Intelligence, Jun-12-2025

Generalization Error Analysis for Attack-Free and Byzantine-Resilient Decentralized Learning with Data Heterogeneity

Ye, Haoxiang, Sun, Tao, Ling, Qing

Decentralized learning, which facilitates joint model training across geographically scattered agents, has gained significant attention in the field of signal and information processing in recent years. While the optimization errors of decentralized learning algorithms have been extensively studied, their generalization errors remain relatively under-explored. As the generalization errors reflect the scalability of trained models on unseen data and are crucial in determining the performance of trained models in real-world applications, understanding the generalization errors of decentralized learning is of paramount importance. In this paper, we present a fine-grained generalization error analysis for both attack-free and Byzantine-resilient decentralized learning with heterogeneous data and under mild assumptions, in contrast to prior studies that consider homogeneous data and/or rely on a stringent bounded stochastic gradient assumption. Our results shed light on the impact of data heterogeneity, model initialization and stochastic gradient noise - factors that have not been closely investigated before - on the generalization error of decentralized learning. We also reveal that Byzantine attacks performed by malicious agents largely affect the generalization error, and their negative impact is inherently linked to the data heterogeneity while remaining independent of the sample size. Numerical experiments on both convex and non-convex tasks are conducted to validate our theoretical findings.
Recent years have witnessed significant advances in distributed learning, which enables geographically scattered devices to collaboratively train models while ensuring the privacy of local data. According to the underlying network topology, distributed learning can be classified into two categories: federated learning and decentralized learning. Federated learning relies on a central server to coordinate the learning process [2]-[8], while decentralized learning is able to operate autonomously without the need for a central server [9]-[18]. Notably, decentralized learning has gained increasing attention for its capacity to circumvent the communication bottleneck inherent in federated learning, caused by the central server.

artificial intelligence, generalization error, machine learning

2506.09438

Country: Asia > China (0.46)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.57)

Egger, Maximilian, Bakshi, Mayank, Bitar, Rawad

Byzantine-Resilient Zero-Order Optimization for Communication-Efficient Heterogeneous Federated Learning

arXiv.org Machine Learning, Jan-31-2025

We introduce CyBeR-0, a Byzantine-resilient federated zero-order optimization method that is robust under Byzantine attacks and provides significant savings in uplink and downlink communication costs. We introduce transformed robust aggregation to give convergence guarantees for general non-convex objectives under client data heterogeneity. Empirical evaluations for standard learning tasks and fine-tuning large language models show that CyBeR-0 exhibits stable performance with only a few scalars per-round communication cost and reduced memory requirements.
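To make the "few scalars per round" idea concrete, here is a minimal sketch of a generic two-point zero-order step with a shared random seed. It is an illustration under stated assumptions, not the CyBeR-0 algorithm: the function names, the Gaussian direction, the smoothing parameter `mu`, and the use of a plain median over client scalars are all hypothetical choices. Each client reports one scalar (a finite-difference estimate of the directional derivative), and the server regenerates the direction from the seed instead of receiving a full gradient.

```python
import random
from statistics import median


def direction(seed, dim):
    """Random direction that server and clients can regenerate from a shared seed."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def client_scalar(loss, x, seed, mu=1e-3):
    """Two-point estimate of the directional derivative of `loss` at x."""
    z = direction(seed, len(x))
    plus = loss([xi + mu * zi for xi, zi in zip(x, z)])
    minus = loss([xi - mu * zi for xi, zi in zip(x, z)])
    return (plus - minus) / (2 * mu)


def server_step(x, scalars, seed, lr):
    """Robustly combine the client scalars (median, an assumption) and step."""
    z = direction(seed, len(x))
    g = median(scalars)
    return [xi - lr * g * zi for xi, zi in zip(x, z)]
```

In this sketch the uplink cost per client is a single float per round, and a Byzantine client can only influence the update through that one scalar, which the median bounds as long as honest clients form a majority.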

byzantine-resilient zero-order optimization, machine learning, natural language

2502.00193

Country:

Asia > Afghanistan > Parwan Province > Charikar (0.04)
North America > United States > Arizona (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.93)

Industry: Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.46)

arXiv.org Artificial Intelligence, Aug-12-2024

Understanding Byzantine Robustness in Federated Learning with A Black-box Server

Zhao, Fangyuan, Xie, Yuexiang, Ren, Xuebin, Ding, Bolin, Yang, Shusen, Li, Yaliang

Federated learning (FL) is vulnerable to Byzantine attacks, in which some participants try to damage the utility or hinder the convergence of the learned model by sending malicious model updates. Previous works propose applying robust rules to aggregate updates from participants against different types of Byzantine attacks; at the same time, attackers can design advanced Byzantine attack algorithms that target a specific aggregation rule when it is known. In practice, FL systems can involve a black-box server that makes the adopted aggregation rule inaccessible to participants, which can naturally defend against or weaken some Byzantine attacks. In this paper, we provide an in-depth understanding of the Byzantine robustness of an FL system with a black-box server. Our investigation demonstrates the improved Byzantine robustness of a black-box server employing a dynamic defense strategy. We provide both empirical evidence and theoretical analysis to reveal that the black-box server can mitigate the worst-case attack impact from a maximum level to an expectation level, which is attributed to the inherent inaccessibility and randomness offered by a black-box server.
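One way to picture a dynamic defense is a server that draws its aggregation rule at random each round, so an attacker cannot tailor updates to a known rule. The sketch below is an assumption-laden illustration, not the paper's strategy: the rule pool (coordinate-wise median and trimmed mean), the uniform choice, and all names are hypothetical.

```python
import random
from statistics import median


def coordinate_median(updates):
    """Coordinate-wise median of the client updates."""
    return [median(coords) for coords in zip(*updates)]


def trimmed_mean(updates, trim=1):
    """Coordinate-wise mean after dropping the `trim` smallest and largest values."""
    out = []
    for coords in zip(*updates):
        kept = sorted(coords)[trim:len(coords) - trim]
        out.append(sum(kept) / len(kept))
    return out


# Hypothetical pool of rules hidden from the participants.
RULES = [coordinate_median, trimmed_mean]


def black_box_aggregate(updates, rng):
    """Pick a rule uniformly at random for this round and apply it."""
    rule = rng.choice(RULES)
    return rule(updates)
```

Because the rule is sampled per round, an attack optimized against one rule is only effective in the rounds where that rule happens to be drawn, which matches the intuition of moving from a worst-case to an expected attack impact.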

aggregation rule, black-box server, server

2408.06042

Country:

North America > United States > Virginia (0.04)
Europe > Italy > Veneto > Venice (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation > Air (1.00)
Information Technology > Security & Privacy (0.89)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial Intelligence, Jul-11-2024

Generalization Error Matters in Decentralized Learning Under Byzantine Attacks

Ye, Haoxiang, Ling, Qing

Recently, decentralized learning has emerged as a popular peer-to-peer signal and information processing paradigm that enables model training across geographically distributed agents in a scalable manner, without the presence of any central server. When some of the agents are malicious (also termed Byzantine), resilient decentralized learning algorithms are able to limit the impact of these Byzantine agents without knowing their number and identities, and have guaranteed optimization errors. However, analysis of the generalization errors, which are critical to implementations of the trained models, is still lacking. In this paper, we provide the first analysis of the generalization errors for a class of popular Byzantine-resilient decentralized stochastic gradient descent (DSGD) algorithms. Our theoretical results reveal that the generalization errors cannot be entirely eliminated because of the presence of the Byzantine agents, even if the number of training samples is infinitely large. Numerical experiments are conducted to confirm our theoretical results.

agent, generalization error, training sample

2407.08632

Country: Asia > China > Guangdong Province > Guangzhou (0.04)

Genre: Research Report (0.40)

Industry: Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.46)

arXiv.org Artificial Intelligence, Dec-20-2023

On the Tradeoff between Privacy Preservation and Byzantine-Robustness in Decentralized Learning

Ye, Haoxiang, Zhu, Heng, Ling, Qing

This paper jointly considers privacy preservation and Byzantine-robustness in decentralized learning. In a decentralized network, honest-but-curious agents faithfully follow the prescribed algorithm, but expect to infer their neighbors' private data from messages received during the learning process, while dishonest-and-Byzantine agents disobey the prescribed algorithm, and deliberately disseminate wrong messages to their neighbors so as to bias the learning process. For this novel setting, we investigate a generic privacy-preserving and Byzantine-robust decentralized stochastic gradient descent (SGD) framework, in which Gaussian noise is injected to preserve privacy and robust aggregation rules are adopted to counteract Byzantine attacks. We analyze its learning error and privacy guarantee, discovering an essential tradeoff between privacy preservation and Byzantine-robustness in decentralized learning -- the learning error caused by defending against Byzantine attacks is exacerbated by the Gaussian noise added to preserve privacy. For a class of state-of-the-art robust aggregation rules, we give a unified analysis of the "mixing abilities". Building upon this analysis, we reveal how the "mixing abilities" affect the tradeoff between privacy preservation and Byzantine-robustness. The theoretical results provide guidelines for achieving a favorable tradeoff with proper design of robust aggregation rules. Numerical experiments are conducted and corroborate our theoretical findings.
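The framework described above combines two ingredients: Gaussian noise added to outgoing messages, and a robust rule applied to incoming ones. A minimal sketch of one communication step follows, assuming a coordinate-wise trimmed mean as the robust rule and a single noise scale `sigma`; the function names and these choices are hypothetical and not taken from the paper.

```python
import random


def perturb(message, sigma, rng):
    """Add independent Gaussian noise to each coordinate before sending."""
    return [m + rng.gauss(0.0, sigma) for m in message]


def trimmed_mean(messages, trim):
    """Coordinate-wise mean after dropping the `trim` smallest and largest values."""
    out = []
    for coords in zip(*messages):
        kept = sorted(coords)[trim:len(coords) - trim]
        out.append(sum(kept) / len(kept))
    return out


def receive(own_model, neighbor_messages, trim):
    """Robustly aggregate the agent's own model with the noisy neighbor messages."""
    return trimmed_mean([own_model] + neighbor_messages, trim)
```

The tradeoff is visible even in this toy form: a larger `sigma` hides more about each neighbor's model, but it also spreads the honest messages apart, which makes it harder for the trimming step to separate honest values from Byzantine ones.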

agent, learning, robust aggregation rule

2308.14606

Country:

North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > San Diego County > La Jolla (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)

Karimireddy, Sai Praneeth, He, Lie, Jaggi, Martin

Learning from History for Byzantine Robust Optimization

arXiv.org Machine Learning, Dec-18-2020

Byzantine robustness has received significant attention recently given its importance for distributed and federated learning. In spite of this, we identify severe flaws in existing algorithms even when the data across the participants is assumed to be identical. First, we show that most existing robust aggregation rules may not converge even in the absence of any Byzantine attackers, because they are overly sensitive to the distribution of the noise in the stochastic gradients. Secondly, we show that even if the aggregation rules may succeed in limiting the influence of the attackers in a single round, the attackers can couple their attacks across time eventually leading to divergence. To address these issues, we present two surprisingly simple strategies: a new iterative clipping procedure, and incorporating worker momentum to overcome time-coupled attacks. This is the first provably robust method for the standard stochastic non-convex optimization setting.
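The two strategies named in the abstract, iterative clipping and worker momentum, can be sketched as follows. This is a minimal illustration of one common formulation of centered clipping, where each input is clipped to a ball of radius `tau` around the current estimate before averaging; the radius, the iteration count, the momentum form, and the function names are assumptions here rather than details confirmed by the abstract.

```python
import math


def centered_clip(vectors, start, tau, iters=3):
    """Iteratively move `start` toward the inputs, clipping each offset to norm tau."""
    v = list(start)
    n = len(vectors)
    for _ in range(iters):
        update = [0.0] * len(v)
        for x in vectors:
            diff = [xi - vi for xi, vi in zip(x, v)]
            norm = math.sqrt(sum(d * d for d in diff))
            scale = 1.0 if norm <= tau else tau / norm
            for j, d in enumerate(diff):
                update[j] += scale * d / n
        v = [vi + u for vi, u in zip(v, update)]
    return v


def worker_momentum(prev_momentum, grad, beta):
    """Exponential moving average kept by each worker and sent instead of the raw gradient."""
    return [(1 - beta) * g + beta * m for g, m in zip(grad, prev_momentum)]
```

With inputs 1, 1, 1 and an outlier at 1000, a single clipping iteration from 0 with `tau=10` yields 3.25 rather than the plain mean of 250.75, since the outlier's contribution is capped at the clipping radius.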

aggregation rule, gradient, momentum

2012.10333

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.36)

He, Lie, Karimireddy, Sai Praneeth, Jaggi, Martin

Byzantine-Robust Learning on Heterogeneous Datasets via Resampling

arXiv.org Machine Learning, Jun-23-2020

In Byzantine robust distributed optimization, a central server wants to train a machine learning model over data distributed across multiple workers. However, a fraction of these workers may deviate from the prescribed algorithm and send arbitrary messages to the server. While this problem has received significant attention recently, most current defenses assume that the workers have identical data. For realistic cases when the data across workers is heterogeneous (non-iid), we design new attacks which circumvent these defenses leading to significant loss of performance. We then propose a simple resampling scheme that adapts existing robust algorithms to heterogeneous datasets at a negligible computational cost. We theoretically and experimentally validate our approach, showing that combining resampling with existing robust algorithms is effective against challenging attacks.
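The resampling idea can be illustrated with a simplified variant: before robust aggregation, each output vector is formed by averaging `s` of the received gradients, arranged so that every gradient is used exactly `s` times overall. The sketch below is an assumption, since the abstract does not spell out the sampling procedure; it uses `s` independent random permutations, and the function name is hypothetical.

```python
import random


def resample(gradients, s, rng):
    """Return n mixed vectors, each the mean of s gradients drawn via s permutations.

    Every input gradient appears exactly s times across all outputs, so the
    overall mean is unchanged while the outputs become less heterogeneous.
    """
    n = len(gradients)
    perms = []
    for _ in range(s):
        p = list(range(n))
        rng.shuffle(p)
        perms.append(p)
    mixed = []
    for i in range(n):
        group = [gradients[p[i]] for p in perms]
        mixed.append([sum(coords) / s for coords in zip(*group)])
    return mixed
```

The mixed vectors would then be passed to an existing robust aggregation rule in place of the raw gradients. Averaging reduces the spread among honest inputs caused by non-iid data, at the cost of also letting a Byzantine gradient touch up to `s` outputs.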

artificial intelligence, gradient, machine learning

2006.09365

Country:

North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)