Elgabli, Anis
Balancing Energy Efficiency and Distributional Robustness in Over-the-Air Federated Learning
Badi, Mohamed, Issaid, Chaouki Ben, Elgabli, Anis, Bennis, Mehdi
The growing number of wireless edge devices has magnified challenges concerning energy, bandwidth, latency, and data heterogeneity, which have become bottlenecks for distributed learning. To address these issues, this paper presents a novel approach that ensures energy efficiency for distributionally robust federated learning (FL) with over-the-air computation (AirComp). In this context, to effectively balance robustness with energy efficiency, we introduce a novel client selection method that integrates two complementary criteria: a deterministic one designed for energy efficiency, and a probabilistic one designed for distributional robustness. Simulation results underscore the efficacy of the proposed algorithm, revealing its superior performance over the considered baselines from both robustness and energy efficiency perspectives, with more than 3-fold energy savings.
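As an illustration of the two-part selection rule described above, the following minimal Python sketch (not the paper's implementation; select_clients, energy_cost, local_loss, budget, and lam are hypothetical names and assumptions) filters clients deterministically by an energy budget and then samples among the survivors with probabilities that grow with their local loss, mimicking a distributionally robust reweighting.

import numpy as np

def select_clients(energy_cost, local_loss, budget, k, lam=1.0, rng=None):
    # Deterministic part: keep only clients whose upload energy fits the budget.
    rng = rng or np.random.default_rng(0)
    eligible = np.where(energy_cost <= budget)[0]
    # Probabilistic part: sample high-loss clients more often (robustness).
    weights = np.exp(lam * local_loss[eligible])
    probs = weights / weights.sum()
    k = min(k, eligible.size)
    return rng.choice(eligible, size=k, replace=False, p=probs)

# Toy usage with 10 clients.
rng = np.random.default_rng(1)
energy = rng.uniform(0.1, 1.0, 10)
loss = rng.uniform(0.2, 2.0, 10)
print(select_clients(energy, loss, budget=0.7, k=3))

The temperature lam trades off the two criteria: lam = 0 reduces to uniform sampling among energy-feasible clients, while a large lam concentrates selection on the clients with the worst local loss.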
FedNew: A Communication-Efficient and Privacy-Preserving Newton-Type Method for Federated Learning
Elgabli, Anis, Issaid, Chaouki Ben, Bedi, Amrit S., Rajawat, Ketan, Bennis, Mehdi, Aggarwal, Vaneet
While first-order methods for FL problems have been extensively studied in the literature, a handful of second-order methods have recently been proposed for FL. The main advantage of second-order methods is their faster convergence rate, although they suffer from high communication costs. Moreover, sharing the first- and second-order information (which encodes the clients' data) may raise a privacy issue: the Hessian matrix contains valuable information about the properties of the local function/data, and when it is shared with the parameter server (PS), privacy may be violated by eavesdroppers or an honest-but-curious PS. For instance, the authors of [Yin et al., 2014] show that the eigenvalues of the Hessian matrix can be used to extract important information about the input images. In this work, we are interested in developing communication-efficient second-order methods that still preserve the privacy of the individual clients. To this end, we propose a novel framework that hides the gradients as well as the Hessians of the local functions, yet uses second-order information to solve the FL problem. In particular, we divide the standard Newton step into outer and inner levels. The objective of the inner level is to learn what we call the Hessian inverse-gradient product, or the zeroth Hessian inverse-gradient product for the computation-efficient version.
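The split into inner and outer levels can be pictured with the short Python sketch below; it is only an illustration under simplifying assumptions (explicit local Hessians, positive definite H, no ADMM-based masking), and inner_direction/outer_step are hypothetical names, not the paper's API. The inner level approximates the Hessian inverse-gradient product with a few conjugate-gradient iterations, and the outer level averages the resulting directions at the server.

import numpy as np

def inner_direction(H, g, steps=20):
    # Inner level (sketch): approximate d = H^{-1} g by conjugate gradient,
    # so only the resulting direction d would ever need to leave the client.
    d = np.zeros_like(g)
    r = g - H @ d
    p = r.copy()
    for _ in range(steps):
        rs = r @ r
        if rs < 1e-12:
            break
        alpha = rs / (p @ H @ p)
        d = d + alpha * p
        r = r - alpha * (H @ p)
        beta = (r @ r) / rs
        p = r + beta * p
    return d

def outer_step(theta, hessians, grads, lr=1.0):
    # Outer level (sketch): average the clients' directions and take a
    # Newton-type step on the global model.
    directions = [inner_direction(H, g) for H, g in zip(hessians, grads)]
    return theta - lr * np.mean(directions, axis=0)

In the actual framework the inner problem is solved with ADMM (or its zeroth-order variant) precisely so that raw gradients and Hessians are never exposed; the sketch only conveys the two-level structure.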
Energy-Efficient and Federated Meta-Learning via Projected Stochastic Gradient Ascent
Elgabli, Anis, Issaid, Chaouki Ben, Bedi, Amrit S., Bennis, Mehdi, Aggarwal, Vaneet
In this paper, we propose an energy-efficient federated meta-learning framework. The objective is to learn a meta-model that can be fine-tuned to a new task with a small number of samples, in a distributed setting and at low computation and communication energy consumption. We assume that each task is owned by a separate agent, so a limited number of tasks is used to train the meta-model. Assuming each task was trained offline on the agent's local data, we propose a lightweight algorithm that starts from the local models of all agents and, working backward using projected stochastic gradient ascent (P-SGA), finds a meta-model. The proposed method avoids complex computations such as Hessian evaluation, double looping, and matrix inversion, while achieving high performance at significantly lower energy consumption than state-of-the-art methods such as MAML and iMAML in experiments on sinusoid regression and image classification tasks.
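The projected stochastic gradient ascent template the method builds on can be sketched as follows; this is not the paper's exact formulation (objective_grad, radius, and the projection set are hypothetical placeholders), only an illustration of picking a random task, taking an ascent step, and projecting back onto a feasible set anchored at the locally trained models.

import numpy as np

def project_ball(x, center, radius):
    # Euclidean projection onto an L2 ball around center.
    diff = x - center
    norm = np.linalg.norm(diff)
    return x if norm <= radius else center + radius * diff / norm

def p_sga(task_models, objective_grad, radius=1.0, lr=0.05, iters=200, seed=0):
    # Start from the average of the agents' local models, then repeatedly:
    # sample a task, ascend its (surrogate) objective, project back.
    rng = np.random.default_rng(seed)
    theta = np.mean(task_models, axis=0)
    for _ in range(iters):
        i = rng.integers(len(task_models))
        theta = theta + lr * objective_grad(theta, i)
        theta = project_ball(theta, task_models[i], radius)
    return theta

Note that no Hessian computation, inner adaptation loop, or matrix inversion appears anywhere, which is what keeps the per-iteration computation (and hence energy) low.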
BayGo: Joint Bayesian Learning and Information-Aware Graph Optimization
Alshammari, Tamara, Samarakoon, Sumudu, Elgabli, Anis, Bennis, Mehdi
This article deals with distributed machine learning, in which agents update their models based on their local datasets and aggregate the updated models collaboratively and in a fully decentralized manner. Specifically, we tackle the problem of information heterogeneity arising in multi-agent networks, where the placement of informative agents plays a crucial role in the learning dynamics. We propose BayGo, a novel fully decentralized joint Bayesian learning and graph optimization framework with proven fast convergence over a sparse graph. Under our framework, agents are able to learn from and communicate with the agent that is most informative for their own learning. Unlike prior works, our framework assumes no prior knowledge of the data distribution across agents, nor any knowledge of the true parameter of the system. The proposed alternating-minimization-based framework ensures global connectivity in a fully decentralized way while minimizing the number of communication links. We theoretically show that by optimizing the proposed objective function, the estimation error of the posterior probability distribution decreases exponentially at each iteration. Via extensive simulations, we show that our framework achieves faster convergence and higher accuracy compared to fully connected and star-topology graphs.
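One way to picture the alternating structure, under assumptions that are ours rather than the paper's (a discrete parameter grid, log-space posteriors, and KL divergence as the informativeness measure), is the Python sketch below: each agent alternates between a Bayesian posterior update mixed with its selected neighbor and a graph step that reselects the neighbor whose posterior differs most from its own.

import numpy as np

def bayes_step(log_post, log_lik, neighbor_log_post, mix=0.5):
    # Learning step (sketch): fuse own likelihood with the selected
    # neighbor's posterior, then renormalize in log space.
    fused = (1 - mix) * (log_post + log_lik) + mix * neighbor_log_post
    return fused - np.logaddexp.reduce(fused)

def graph_step(log_post, others_log_post):
    # Graph step (sketch): link to the agent whose posterior is farthest
    # from ours in KL divergence, i.e. the most informative one.
    p = np.exp(log_post)
    kls = [np.sum(p * (log_post - q)) for q in others_log_post]
    return int(np.argmax(kls))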
Communication Efficient Distributed Learning with Censored, Quantized, and Generalized Group ADMM
Issaid, Chaouki Ben, Elgabli, Anis, Park, Jihong, Bennis, Mehdi
In this paper, we propose a communication-efficient decentralized machine learning framework that solves a consensus optimization problem defined over a network of inter-connected workers. The proposed algorithm, Censored-and-Quantized Generalized GADMM (CQ-GGADMM), leverages the worker grouping and decentralized learning ideas of the Group Alternating Direction Method of Multipliers (GADMM), and pushes the frontier in communication efficiency by extending its applicability to generalized network topologies while incorporating link censoring of negligible updates after quantization. We theoretically prove that CQ-GGADMM achieves a linear convergence rate when the local objective functions are strongly convex, under some mild assumptions. Numerical simulations corroborate that CQ-GGADMM exhibits higher communication efficiency, in terms of the number of communication rounds and transmit energy consumption, without compromising accuracy or convergence speed, compared to benchmark schemes based on decentralized ADMM without censoring, quantization, and/or the worker grouping of GADMM.
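The censoring idea can be stated in a couple of lines of Python; the decaying threshold tau0 * rho**k and the function name should_transmit are illustrative assumptions, not the paper's exact rule.

import numpy as np

def should_transmit(q_update, k, tau0=1.0, rho=0.9):
    # Censoring (sketch): at round k, skip sending the already-quantized
    # update if its energy falls below a geometrically decaying threshold.
    return float(np.linalg.norm(q_update) ** 2) >= tau0 * rho ** k

# Usage: a worker only spends transmit energy on non-negligible updates.
update = np.array([0.01, -0.02, 0.005])
print(should_transmit(update, k=3))

Skipped (censored) rounds cost no transmit energy, which is where the reported savings in communication rounds and energy come from.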
Communication-Efficient and Distributed Learning Over Wireless Networks: Principles and Applications
Park, Jihong, Samarakoon, Sumudu, Elgabli, Anis, Kim, Joongheon, Bennis, Mehdi, Kim, Seong-Lyun, Debbah, Mérouane
Machine learning (ML) is a promising enabler for the fifth generation (5G) communication systems and beyond. By imbuing intelligence into the network edge, edge nodes can proactively carry out decision-making, and thereby react to local environmental changes and disturbances while experiencing zero communication latency. To achieve this goal, it is essential to cater for high ML inference accuracy at scale under time-varying channel and network dynamics, by continuously exchanging fresh data and ML model updates in a distributed way. Taming this new kind of data traffic boils down to improving the communication efficiency of distributed learning by optimizing communication payload types, transmission techniques, and scheduling, as well as ML architectures, algorithms, and data processing methods. To this end, this article aims to provide a holistic overview of relevant communication and ML principles, and thereby present communication-efficient and distributed learning frameworks with selected use cases.
Harnessing Wireless Channels for Scalable and Privacy-Preserving Federated Learning
Elgabli, Anis, Park, Jihong, Issaid, Chaouki Ben, Bennis, Mehdi
Wireless connectivity is instrumental in enabling scalable federated learning (FL), yet wireless channels bring challenges for model training: channel randomness perturbs each worker's model update, while multiple workers' updates incur significant interference under limited bandwidth. To address these challenges, in this work we formulate a novel constrained optimization problem and propose an FL framework harnessing wireless channel perturbations and interference to improve privacy, bandwidth efficiency, and scalability. The resulting algorithm, coined analog federated ADMM (A-FADMM), is based on analog transmissions and the alternating direction method of multipliers (ADMM). In A-FADMM, all workers upload their model updates to the parameter server (PS) over a single channel via analog transmissions, during which all models are perturbed and aggregated over the air. This not only saves communication bandwidth, but also hides each worker's exact model update trajectory from any eavesdropper, including the honest-but-curious PS, thereby preserving data privacy against model inversion attacks. We formally prove the convergence and privacy guarantees of A-FADMM for convex functions under time-varying channels, and numerically show the effectiveness of A-FADMM under noisy channels and stochastic non-convex functions, in terms of convergence speed and scalability as well as communication bandwidth and energy efficiency.
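A toy simulation of the over-the-air aggregation step is sketched below; it is not the paper's transceiver design (complex-valued fading, power control, and channel compensation are all simplified away), and air_aggregate plus its parameters are hypothetical. The point it illustrates is that the PS only ever observes the noisy superposition of all workers' analog signals, never an individual update.

import numpy as np

def air_aggregate(updates, snr_db=20.0, rng=None):
    # All workers transmit simultaneously on one channel; their channel-scaled
    # signals add up in the air and the PS receives the sum plus noise.
    rng = rng or np.random.default_rng(0)
    rx = np.zeros_like(updates[0])
    for x in updates:
        h = rng.normal(1.0, 0.1)      # toy real-valued fading coefficient
        rx += h * x
    noise_power = np.mean(rx ** 2) / (10 ** (snr_db / 10))
    return rx + rng.normal(0.0, np.sqrt(noise_power), rx.shape)

# Three workers, one shared channel use per model dimension.
updates = [np.ones(4), 2 * np.ones(4), -0.5 * np.ones(4)]
print(air_aggregate(updates))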
Q-GADMM: Quantized Group ADMM for Communication Efficient Decentralized Machine Learning
Elgabli, Anis, Park, Jihong, Bedi, Amrit S., Bennis, Mehdi, Aggarwal, Vaneet
In this paper, we propose a communication-efficient decentralized machine learning (ML) algorithm, coined quantized group ADMM (Q-GADMM). Every worker in Q-GADMM communicates only with two neighbors and updates its model via the alternating direction method of multipliers (ADMM), thereby ensuring fast convergence while reducing the number of communication rounds. Furthermore, each worker quantizes its model updates before transmission, thereby decreasing the communication payload sizes. We prove that Q-GADMM converges for convex loss functions, and numerically show that Q-GADMM yields 7x less communication cost while achieving almost the same accuracy and convergence speed as a baseline without quantization, group ADMM (GADMM).
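A minimal sketch of the payload quantization (our own illustrative quantizer, not necessarily the one analyzed in the paper) is shown below: each entry of the model update is stochastically rounded to one of 2^bits uniform levels, so only integer levels, a scale, and signs need to be transmitted instead of full-precision floats.

import numpy as np

def stochastic_quantize(delta, bits=4, rng=None):
    # Unbiased uniform quantization of a model update onto 2^bits levels.
    rng = rng or np.random.default_rng(0)
    levels = 2 ** bits - 1
    scale = np.max(np.abs(delta)) + 1e-12
    q = np.floor(np.abs(delta) / scale * levels + rng.random(delta.shape))
    return np.sign(delta) * q * scale / levels

delta = np.array([0.8, -0.3, 0.05, 0.0])
print(stochastic_quantize(delta, bits=2))

Stochastic rounding keeps the quantizer unbiased, which is what allows convergence to be preserved while shrinking each transmitted payload.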
GADMM: Fast and Communication Efficient Framework for Distributed Machine Learning
Elgabli, Anis, Park, Jihong, Bedi, Amrit S., Bennis, Mehdi, Aggarwal, Vaneet
When data is distributed across multiple servers (or workers), efficient information exchange between them for solving the distributed learning problem is a key challenge and is the focus of this paper. We propose a fast, privacy-aware, and communication-efficient decentralized framework to solve the distributed machine learning (DML) problem. The proposed algorithm, GADMM, is based on the Alternating Direction Method of Multipliers (ADMM). The key novelty in GADMM is that each worker exchanges its locally trained model only with two neighboring workers, thereby training a global model with a lower amount of communication per exchange. We prove that GADMM converges faster than centralized batch gradient descent for convex loss functions, and numerically show that it is faster and more communication-efficient than state-of-the-art communication-efficient centralized algorithms such as the Lazily Aggregated Gradient (LAG), in linear and logistic regression tasks on synthetic and real datasets. Furthermore, we propose Dynamic GADMM (D-GADMM), a variant of GADMM, and prove its convergence under a time-varying network topology of the workers.
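The neighbor-only alternation at the heart of GADMM can be sketched as follows; this is a simplified single-gradient-step variant over a chain of workers (the actual algorithm solves each local subproblem exactly), and gadmm_round, grads_fn, and the step size lr are illustrative assumptions.

import numpy as np

def gadmm_round(models, duals, grads_fn, rho=1.0, lr=0.1):
    # Workers sit on a chain; edge e enforces models[e] == models[e + 1].
    n = len(models)

    def local_update(i):
        # Gradient of the local loss plus Lagrangian and proximity terms
        # toward the (at most two) neighboring workers.
        g = grads_fn(i, models[i])
        for j in (i - 1, i + 1):
            if 0 <= j < n:
                e = min(i, j)
                sgn = 1.0 if j > i else -1.0   # orientation of the constraint
                g = g + sgn * duals[e] + rho * (models[i] - models[j])
        return models[i] - lr * g

    for i in range(0, n, 2):                   # one group updates first
        models[i] = local_update(i)
    for i in range(1, n, 2):                   # then the other group
        models[i] = local_update(i)
    for e in range(n - 1):                     # dual ascent on every edge
        duals[e] = duals[e] + rho * (models[e] - models[e + 1])
    return models, duals

Because each worker only ever talks to its two neighbors on the chain, the per-round communication is independent of the total number of workers, which is the source of GADMM's communication savings.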