FedDR: Randomized Douglas-Rachford Splitting Algorithms for Nonconvex Federated Composite Optimization. A. The Analysis of Algorithm 1: Randomized Coordinate Variant (FedDR)

Neural Information Processing Systems

FedAvg: FedAvg [29] has become a de facto standard federated learning algorithm in practice. However, it has several limitations, as discussed in many papers, including [23]. It is also difficult to analyze the convergence of FedAvg, especially in the nonconvex case and under heterogeneity (both statistical and system heterogeneity). Moreover, FedAvg originally specifies SGD with a fixed number of epochs and a fixed learning rate as its local solver, making it less flexible in practice.
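For concreteness, one round of FedAvg as described above (local SGD with a fixed learning rate and a fixed number of epochs, followed by server averaging) can be sketched as follows. This is a minimal illustration with a toy least-squares loss and unweighted averaging; the model, loss, and all names are illustrative, not from the paper:

```python
import numpy as np

def fedavg_round(global_w, client_data, lr=0.05, epochs=5, seed=0):
    """One FedAvg round (sketch): each client runs SGD with a fixed
    learning rate and a fixed number of epochs on a toy least-squares
    loss, then the server averages the resulting client weights."""
    rng = np.random.default_rng(seed)
    local_ws = []
    for X, y in client_data:
        w = global_w.copy()
        for _ in range(epochs):
            for i in rng.permutation(len(y)):       # one SGD pass over the data
                grad = (X[i] @ w - y[i]) * X[i]     # per-sample gradient
                w -= lr * grad
        local_ws.append(w)
    # Server step: plain (unweighted) average of the client models.
    return np.mean(local_ws, axis=0)
```

Note that both the epoch count and the learning rate are fixed hyperparameters of the local solver, which is exactly the inflexibility pointed out above.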






Supplementary Material


R(h). (23) Here, for simplicity, we abused the symbol D in (22) by maximizing out h0 in the original D. In the top-left area of P, suppose only one example (marked by x, with vertical coordinate 1) is confidently labeled as positive, and the remaining examples are labeled with very low confidence, hence do not contribute to the risk R. Similarly, there is only one confidently labeled example () in the bottom-right area of P, and it is negative with vertical coordinate 1. Whenever λ > 2, the optimal hλ lies in (0, 1) and can be obtained by solving a quadratic equation. In contrast, di-MDD is immune to this problem because R is used only to determine h, while the di-MDD value itself is contributed solely by D. As in the large-λ scenario, we do not change the feature distribution of the source and target domains, hence keeping D(h) = 1 − |h|.





Clustered attention makes use of similarities between queries and groups them in order to reduce the computational cost. In particular, we perform fast clustering using locality-sensitive hashing and K-Means and only compute the attention once per cluster.
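The per-cluster computation can be sketched as follows. This is a minimal NumPy sketch using plain K-Means on the queries; the fast LSH-assisted clustering mentioned above is omitted, and all names and shapes are illustrative rather than the authors' implementation:

```python
import numpy as np

def clustered_attention(Q, K, V, n_clusters=4, n_iters=10, seed=0):
    """Approximate softmax attention by clustering the queries and
    computing attention once per cluster centroid instead of once
    per query (a sketch of the clustered-attention idea)."""
    rng = np.random.default_rng(seed)
    n, d = Q.shape
    # Plain Lloyd-style K-Means on the queries.
    centroids = Q[rng.choice(n, n_clusters, replace=False)]
    for _ in range(n_iters):
        dists = ((Q[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(1)
        for c in range(n_clusters):
            members = Q[assign == c]
            if len(members):
                centroids[c] = members.mean(0)
    # Attention is computed once per centroid, not once per query.
    scores = centroids @ K.T / np.sqrt(d)            # (n_clusters, n_keys)
    weights = np.exp(scores - scores.max(1, keepdims=True))
    weights /= weights.sum(1, keepdims=True)
    cluster_out = weights @ V                        # (n_clusters, d_v)
    # Each query reuses the attention output of its cluster.
    return cluster_out[assign]
```

The cost of the attention step drops from one softmax per query to one per cluster, which is where the computational savings come from; similar queries share one (approximate) attention distribution.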





The expected improvement (EI) is a popular technique for handling the tradeoff between exploration and exploitation under uncertainty. It has been widely used in Bayesian optimization, but it is not directly applicable to the contextual bandit problem, which is a generalization of both the standard bandit problem and Bayesian optimization.
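For reference, the standard closed-form EI under a Gaussian posterior N(mu, sigma^2) at a candidate point, relative to the incumbent best value, can be sketched as below (for maximization; the exploration margin `xi` is a common but optional extra and the names are illustrative):

```python
import math

def expected_improvement(mu, sigma, best, xi=0.0):
    """Closed-form EI for a Gaussian posterior N(mu, sigma^2),
    relative to the incumbent value `best` (maximization):
    EI = (mu - best - xi) * Phi(z) + sigma * phi(z),
    with z = (mu - best - xi) / sigma."""
    if sigma <= 0.0:
        return 0.0
    z = (mu - best - xi) / sigma
    Phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal PDF
    return (mu - best - xi) * Phi + sigma * phi
```

EI is increasing in both the posterior mean (exploitation) and the posterior standard deviation (exploration), which is exactly the tradeoff referred to above.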