

 fedpg-br


A More on the background

Neural Information Processing Systems

A.1 SVRG and SCSG

Here we provide the pseudocode for SVRG (Algorithm 2) and SCSG (Algorithm 3) as given in Lei et al. [35]. The idea of SVRG (Algorithm 2) is to reuse past full gradient computations (line 3) to reduce the variance of the current stochastic gradient estimate (line 7) before the parameter update (line 8). Note that N = 1 corresponds to a GD step (i.e., v …). SVRG achieves linear convergence O(1/T) using the semi-stochastic gradient. The key difference in SCSG (Algorithm 3) is that it considers a sequence of time-varying batch sizes (B_t and b_t) and employs geometric sampling to generate the number of parameter update steps N_t in each iteration (line 6), instead of fixing the batch sizes and the number of updates as done in SVRG. In particular, when finding an ε-approximate solution (Definition 1) for optimizing smooth non-convex objectives, Lei et al. [35] prove that SCSG is never worse than SVRG in convergence rate and significantly outperforms SVRG when the required ε is small.
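As a rough illustration of the two update schemes described above, the following Python sketch may help. It is not a transcription of Algorithms 2 and 3: it assumes a toy scalar least-squares objective and constant batch sizes (`big_batch`, `mini_batch`) in place of the time-varying B_t and b_t. It also assumes a geometric distribution with parameter B/(B+b) for the number of inner updates and takes the last inner iterate as the new snapshot. All function and parameter names are hypothetical.

```python
import random


def grad_i(x, sample):
    """Gradient of the per-sample loss 0.5 * (a * x - b) ** 2 at x."""
    a, b = sample
    return a * (a * x - b)


def mean_grad(x, samples):
    """Average per-sample gradient over a list of samples."""
    return sum(grad_i(x, s) for s in samples) / len(samples)


def svrg(data, x0, step, epochs, inner_steps, rng):
    """SVRG sketch: fixed number of inner updates per epoch."""
    x_snap = x0
    for _ in range(epochs):
        mu = mean_grad(x_snap, data)  # full gradient at the snapshot
        x = x_snap
        for _ in range(inner_steps):
            s = rng.choice(data)
            # Semi-stochastic gradient: the snapshot terms reduce variance.
            v = grad_i(x, s) - grad_i(x_snap, s) + mu
            x -= step * v
        x_snap = x  # assumption: last inner iterate becomes the snapshot
    return x_snap


def sample_geometric(gamma, rng):
    """Draw N with P(N = k) = gamma**k * (1 - gamma) for k = 0, 1, 2, ..."""
    n = 0
    while rng.random() < gamma:
        n += 1
    return n


def scsg(data, x0, step, outer_iters, big_batch, mini_batch, rng):
    """SCSG sketch: batch gradient at the snapshot, random inner-loop length."""
    x_snap = x0
    gamma = big_batch / (big_batch + mini_batch)  # assumed parameterisation
    for _ in range(outer_iters):
        batch = rng.sample(data, big_batch)
        mu = mean_grad(x_snap, batch)  # batch (not full) gradient
        n_steps = sample_geometric(gamma, rng)  # mean is big_batch / mini_batch
        x = x_snap
        for _ in range(n_steps):
            mb = rng.sample(data, mini_batch)
            v = mean_grad(x, mb) - mean_grad(x_snap, mb) + mu
            x -= step * v
        x_snap = x
    return x_snap


# Toy data (a, b) with b = 2 * a, so the minimiser of the average loss is x = 2.
DATA = [(0.9, 1.8), (1.0, 2.0), (1.1, 2.2), (1.0, 2.0)]

if __name__ == "__main__":
    print(svrg(DATA, 0.0, 0.1, 30, 20, random.Random(0)))
    print(scsg(DATA, 0.0, 0.1, 400, 2, 1, random.Random(0)))
```

In this sketch, calling `svrg` with `inner_steps=1` makes the two per-sample terms cancel, so the update reduces to a plain full-gradient step, consistent with the remark that N = 1 corresponds to a GD step. The `scsg` variant replaces the full gradient with a batch gradient and lets the inner-loop length vary randomly, which is the structural difference the text describes.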




Fault-Tolerant Federated Reinforcement Learning with Theoretical Guarantee

arXiv.org Artificial Intelligence

The growing literature on Federated Learning (FL) has recently inspired Federated Reinforcement Learning (FRL) to encourage multiple agents to federatively build a better decision-making policy without sharing raw trajectories. Despite its promising applications, existing works on FRL fail to I) provide theoretical analysis of its convergence, and II) account for random system failures and adversarial attacks. To this end, we propose the first FRL framework whose convergence is guaranteed and which tolerates fewer than half of the participating agents suffering random system failures or acting as adversarial attackers. We prove that the sample efficiency of the proposed framework is guaranteed to improve with the number of agents and is able to account for such potential failures or attacks. All theoretical results are empirically verified on various RL benchmark tasks.