participation
- North America > United States (0.14)
- Europe > Switzerland > Zürich > Zürich (0.14)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Workflow (0.46)
- Research Report > New Finding (0.45)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Security & Privacy (0.92)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
- North America > United States > Virginia (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Africa > South Sudan > Equatoria > Central Equatoria > Juba (0.04)
- Research Report > Experimental Study (0.98)
- Research Report > New Finding (0.70)
- Personal (0.68)
- Law (1.00)
- Information Technology > Security & Privacy (0.69)
Double Machine Learning Density Estimation for Local Treatment Effects with Instruments
Local treatment effects are a common quantity found throughout the empirical sciences that measure the treatment effect among those who comply with what they are assigned. Most of the literature is focused on estimating the average of such quantity, which is called the " local average treatment effect (LATE) " [
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Iowa (0.04)
- North America > Greenland (0.04)
- Asia > Taiwan > Taiwan Province > Taipei (0.04)
SA-PEF: Step-Ahead Partial Error Feedback for Efficient Federated Learning
Redie, Dawit Kiros, Arablouei, Reza, Werner, Stefan
Biased gradient compression with error feedback (EF) reduces communication in federated learning (FL), but under non-IID data, the residual error can decay slowly, causing gradient mismatch and stalled progress in the early rounds. We propose step-ahead partial error feedback (SA-PEF), which integrates step-ahead (SA) correction with partial error feedback (PEF). SA-PEF recovers EF when the step-ahead coefficient α = 0 and step-ahead EF (SAEF) when α = 1. For non-convex objectives and δ-contractive compressors, we establish a second-moment bound and a residual recursion that guarantee convergence to stationar-ity under heterogeneous data and partial client participation. To balance SAEF's rapid warm-up with EF's long-term stability, we select α near its theory-predicted optimum. Experiments across diverse architectures and datasets show that SA-PEF consistently reaches target accuracy faster than EF. Modern large-scale machine learning increasingly relies on distributed computation, where both data and compute are spread across many devices. Federated learning (FL) enables model training in this setting without centralizing raw data, enhancing privacy and scalability under heterogeneous client distributions (McMahan et al., 2017; Kairouz et al., 2021). In each synchronous FL round, the server broadcasts the current global model to a subset of clients. These clients perform several steps of stochastic gradient descent (SGD) on their local data and return updates to the server, which aggregates them to form the next global iterate (Huang et al., 2022; Wang & Ji, 2022; Li et al., 2024). Although FL leverages rich distributed data, it faces two key challenges.
- Europe > Norway (0.14)
- Oceania > Australia (0.04)
- North America > United States > Virginia (0.04)
- Europe > Finland (0.04)
FedSGM: A Unified Framework for Constraint Aware, Bidirectionally Compressed, Multi-Step Federated Optimization
Upadhyay, Antesh, Moon, Sang Bin, Hashemi, Abolfazl
We introduce FedSGM, a unified framework for federated constrained optimization that addresses four major challenges in federated learning (FL): functional constraints, communication bottlenecks, local updates, and partial client participation. Building on the switching gradient method, FedSGM provides projection-free, primal-only updates, avoiding expensive dual-variable tuning or inner solvers. To handle communication limits, FedSGM incorporates bi-directional error feedback, correcting the bias introduced by compression while explicitly understanding the interaction between compression noise and multi-step local updates. We derive convergence guarantees showing that the averaged iterate achieves the canonical $\boldsymbol{\mathcal{O}}(1/\sqrt{T})$ rate, with additional high-probability bounds that decouple optimization progress from sampling noise due to partial participation. Additionally, we introduce a soft switching version of FedSGM to stabilize updates near the feasibility boundary. To our knowledge, FedSGM is the first framework to unify functional constraints, compression, multiple local updates, and partial client participation, establishing a theoretically grounded foundation for constrained federated learning. Finally, we validate the theoretical guarantees of FedSGM via experimentation on Neyman-Pearson classification and constrained Markov decision process (CMDP) tasks.
- North America > United States > Virginia (0.04)
- North America > United States > Wisconsin (0.04)
- North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
- (3 more...)
First Provable Guarantees for Practical Private FL: Beyond Restrictive Assumptions
Shulgin, Egor, Malinovsky, Grigory, Khirirat, Sarit, Richtárik, Peter
Federated Learning (FL) enables collaborative training on decentralized data. Differential privacy (DP) is crucial for FL, but current private methods often rely on unrealistic assumptions (e.g., bounded gradients or heterogeneity), hindering practical application. Existing works that relax these assumptions typically neglect practical FL features, including multiple local updates and partial client participation. We introduce Fed-$α$-NormEC, the first differentially private FL framework providing provable convergence and DP guarantees under standard assumptions while fully supporting these practical features. Fed-$α$-NormE integrates local updates (full and incremental gradient steps), separate server and client stepsizes, and, crucially, partial client participation, which is essential for real-world deployment and vital for privacy amplification. Our theoretical guarantees are corroborated by experiments on private deep learning tasks.
- North America > Canada > Ontario > Toronto (0.14)
- Asia > Singapore (0.04)
- Asia > Middle East > Saudi Arabia > Mecca Province > Thuwal (0.04)
A Computation and Communication Efficient Method for Distributed Nonconvex Problems in the Partial Participation Setting
We present a new method that includes three key components of distributed optimization and federated learning: variance reduction of stochastic gradients, partial participation, and compressed communication. We prove that the new method has optimal oracle complexity and state-of-the-art communication complexity in the partial participation setting. Regardless of the communication compression feature, our method successfully combines variance reduction and partial participation: we get the optimal oracle complexity, never need the participation of all nodes, and do not require the bounded gradients (dissimilarity) assumption.
Learning Agent Representations for Ice Hockey
Team sports is a new application domain for agent modeling with high real-world impact. A fundamental challenge for modeling professional players is their large number (over 1K), which includes many bench players with sparse participation in a game season. The diversity and sparsity of player observations make it difficult to extend previous agent representation models to the sports domain. This paper develops a new approach for agent representations, based on a Markov game model, that is tailored towards applications in professional ice hockey. We introduce a novel player representation via player generation framework where a variational encoder embeds player information with latent variables. The encoder learns a context-specific shared prior to induce a shrinkage effect for the posterior player representations, allowing it to share statistical information across players with different participations. To model the play dynamics in sequential sports data, we design a Variational Recurrent Ladder Agent Encoder (VaRLAE). It learns a contextualized player representation with a hierarchy of latent variables that effectively prevents latent posterior collapse.