fedpidavg
FedPID: An Aggregation Method for Federated Learning
Mächler, Leon, Grimberg, Gustav, Ezhov, Ivan, Nickel, Manuel, Shit, Suprosanna, Naccache, David, Paetzold, Johannes C.
This paper presents FedPID, our submission to the Federated Tumor Segmentation Challenge 2024 (FETS24). Inspired by Fed-CostWAvg and FedPIDAvg, our winning contributions to FETS21 and FETS2022, we propose an improved aggregation strategy for federated and collaborative learning. FedCostWAvg is a method that averages results by considering both the number of training samples in each group and how much the cost function decreased in the last round of training. This is similar to how the derivative part of a PID controller works. In FedPIDAvg, we also included the integral part that was missing. Another challenge we faced were vastly differing dataset sizes at each center. We solved this by assuming the sizes follow a Poisson distribution and adjusting the training iterations for each center accordingly. Essentially, this part of the method controls that outliers that require too much training time are less frequently used. Based on these contributions we now adapted FedPIDAvg by changing how the integral part is computed.
FedPIDAvg: A PID controller inspired aggregation method for Federated Learning
Mächler, Leon, Ezhov, Ivan, Shit, Suprosanna, Paetzold, Johannes C.
This paper presents FedPIDAvg, the winning submission to the Federated Tumor Segmentation Challenge 2022 (FETS22). Inspired by FedCostWAvg, our winning contribution to FETS21, we contribute an improved aggregation strategy for federated and collaborative learning. FedCostWAvg is a weighted averaging method that not only considers the number of training samples of each cluster but also the size of the drop of the respective cost function in the last federated round. This can be interpreted as the derivative part of a PID controller (proportional-integral-derivative controller). In FedPIDAvg, we further add the missing integral term. Another key challenge was the vastly varying size of data samples per center. We addressed this by modeling the data center sizes as following a Poisson distribution and choosing the training iterations per center accordingly. Our method outperformed all other submissions.