Nesterov-Accelerated Robust Federated Learning Over Byzantine Adversaries

Xu, Lihan, Dong, Yanjie, Wang, Gang, Zeng, Runhao, Fan, Xiaoyi, Hu, Xiping

Nov-5-2025–arXiv.org Artificial Intelligence

Abstract--We investigate robust federated learning, where a group of workers collaboratively train a shared model under the orchestration of a central server in the presence of Byzantine adversaries capable of arbitrary and potentially malicious behaviors. T o simultaneously enhance communication efficiency and robustness against such adversaries, we propose a Byzantine-resilient Nesterov-Accelerated Federated Learning (Byrd-NAFL) algorithm. Byrd-NAFL seamlessly integrates Nesterov's momentum into the federated learning process alongside Byzantine-resilient aggregation rules to achieve fast and safeguarding convergence against gradient corruption. We establish a finite-time convergence guarantee for Byrd-NAFL under non-convex and smooth loss functions with relaxed assumption on the aggregated gradients. Extensive numerical experiments validate the effectiveness of Byrd-NAFL and demonstrate the superiority over existing benchmarks in terms of convergence speed, accuracy, and resilience to diverse Byzantine attack strategies. As a promising paradigm for privacy-preserving distributed learning, federated learning (FL) leverages the parallel computational capabilities of user terminals to learn from decentralized data with the orchestration of a central server. Since its inception [1], [2], FL has been proliferating across diverse application scenarios, e.g., healthcare [3], [4], mobile edge [5], [6], and autonomous driving [7], [8]. Despite the merits in preserving user privacy, vanilla FL paradigm is still facing two major challenges, namely, Byzantine resilience [9], [10] and communication efficiency [11]. To robustify the FL paradigm, Byzantine-resilient aggregation rules, e.g., Krum [10], the component-wise median (CwMed) [15], Bulyan [16], and geometric median (GeoMed) [17], are designed to enhance the trustworthiness and reliability of the FL paradigm. Another major challenge in FL lies in enhancing communication efficiency. Current communication-efficient FL algorithms can be broadly classified into three categories: (i) communication frequency reduction [18], [19], [20], [21], [22], [12], (ii) exchanged information compression [23], [24], [25], [6], and (iii) iteration reduction [20], [26], [27], [28].

artificial intelligence, machine learning, nesterov, (16 more...)

arXiv.org Artificial Intelligence

Nov-5-2025

arXiv.org PDF

Add feedback

Country:
- Africa > Ethiopia
  - Addis Ababa > Addis Ababa (0.04)
- Asia
  - China
    - Beijing > Beijing (0.04)
    - Guangdong Province > Shenzhen (0.04)
  - Middle East
    - Jordan (0.04)
    - UAE > Abu Dhabi Emirate
      - Abu Dhabi (0.14)
  - South Korea (0.04)
- Europe > Sweden
  - Stockholm > Stockholm (0.04)
- North America
  - Canada > Quebec (0.04)
  - United States
    - District of Columbia > Washington (0.04)
    - Hawaii (0.04)

Genre:
- Research Report > New Finding (1.00)

Industry:
- Government > Military (0.66)
- Information Technology > Security & Privacy (0.66)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Neural Networks (0.67)
  - Statistical Learning (0.47)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found