Jafarkhani, Hamid
Communication Compression for Distributed Learning without Control Variates
Ortega, Tomas, Huang, Chun-Yin, Li, Xiaoxiao, Jafarkhani, Hamid
Distributed learning algorithms, such as the ones employed in Federated Learning (FL), require communication compression to reduce the cost of client uploads. The compression methods used in practice are often biased, which require error feedback to achieve convergence when the compression is aggressive. In turn, error feedback requires client-specific control variates, which directly contradicts privacy-preserving principles and requires stateful clients. In this paper, we propose Compressed Aggregate Feedback (CAFe), a novel distributed learning framework that allows highly compressible client updates by exploiting past aggregated updates, and does not require control variates. We consider Distributed Gradient Descent (DGD) as a representative algorithm and provide a theoretical proof of CAFe's superiority to Distributed Compressed Gradient Descent (DCGD) with biased compression in the non-smooth regime with bounded gradient dissimilarity. Experimental results confirm that CAFe consistently outperforms distributed learning with direct compression and highlight the compressibility of the client updates with CAFe.
Offline Stochastic Optimization of Black-Box Objective Functions
Dong, Juncheng, Wu, Zihao, Jafarkhani, Hamid, Pezeshki, Ali, Tarokh, Vahid
Many challenges in science and engineering, such as drug discovery and communication network design, involve optimizing complex and expensive black-box functions across vast search spaces. Thus, it is essential to leverage existing data to avoid costly active queries of these black-box functions. To this end, while Offline Black-Box Optimization (BBO) is effective for deterministic problems, it may fall short in capturing the stochasticity of real-world scenarios. To address this, we introduce Stochastic Offline BBO (SOBBO), which tackles both black-box objectives and uncontrolled uncertainties. We propose two solutions: for large-data regimes, a differentiable surrogate allows for gradient-based optimization, while for scarce-data regimes, we directly estimate gradients under conservative field constraints, improving robustness, convergence, and data efficiency. Numerical experiments demonstrate the effectiveness of our approach on both synthetic and real-world tasks.
Quantized and Asynchronous Federated Learning
Ortega, Tomas, Jafarkhani, Hamid
Recent advances in federated learning have shown that asynchronous variants can be faster and more scalable than their synchronous counterparts. However, their design does not include quantization, which is necessary in practice to deal with the communication bottleneck. To bridge this gap, we develop a novel algorithm, Quantized Asynchronous Federated Learning (QAFeL), which introduces a hidden-state quantization scheme to avoid the error propagation caused by direct quantization. QAFeL also includes a buffer to aggregate client updates, ensuring scalability and compatibility with techniques such as secure aggregation. Furthermore, we prove that QAFeL achieves an $\mathcal{O}(1/\sqrt{T})$ ergodic convergence rate for stochastic gradient descent on non-convex objectives, which is the optimal order of complexity, without requiring bounded gradients or uniform client arrivals. We also prove that the cross-term error between staleness and quantization only affects the higher-order error terms. We validate our theoretical findings on standard benchmarks.
Modulation and Coding for NOMA and RSMA
Jafarkhani, Hamid, Maleki, Hossein, Vaezi, Mojtaba
Next-generation multiple access (NGMA) serves as an umbrella term for transmission schemes distinct from conventional orthogonal methods. A key candidate of NGMA, non-orthogonal multiple access (NOMA), emerges as a solution to enhance connectivity by allowing multiple users to share time, frequency, and space concurrently. However, NOMA faces challenges in implementation, particularly in canceling inter-user interference. In this paper, we discuss the principles behind NOMA and review conventional NOMA methods. Then, to address these challenges, we present asynchronous transmission and interference-aware modulation techniques, enabling decoding without successive interference cancellation. The goal is to design constellations that dynamically adapt to interference, minimizing bit error rates (BERs) and enhancing user throughput in the presence of inter-user, inter-carrier, and inter-cell interference. The traditional link between minimizing BER and increasing spectral efficiency is explored, with deep autoencoders for end-to-end communication emerging as a potential solution to improve BERs. Interference-aware modulation can revolutionize constellation design for non-orthogonal channels. Rate-splitting multiple access (RSMA) is another promising interference management technique in multi-user systems. In addition to addressing challenges in finite-alphabet NOMA, this paper offers new insights and provides an overview of code-domain NOMA, trellis-coded NOMA, and RSMA as key NGMA candidates. We also discuss the evolution of channel coding toward low-latency communication and examine modulation and coding schemes in 5G networks. Finally, we highlight future research directions, emphasizing their importance for realizing NOMA from concept to functional technology.
Decentralized Optimization in Time-Varying Networks with Arbitrary Delays
Ortega, Tomas, Jafarkhani, Hamid
We consider a decentralized optimization problem for networks affected by communication delays. Examples of such networks include collaborative machine learning, sensor networks, and multi-agent systems. To mimic communication delays, we add virtual non-computing nodes to the network, resulting in directed graphs. This motivates investigating decentralized optimization solutions on directed graphs. Existing solutions assume nodes know their out-degrees, resulting in limited applicability. To overcome this limitation, we introduce a novel gossip-based algorithm, called DT-GO, that does not need to know the out-degrees. The algorithm is applicable in general directed networks, for example networks with delays or limited acknowledgment capabilities. We derive convergence rates for both convex and non-convex objectives, showing that our algorithm achieves the same complexity order as centralized Stochastic Gradient Descent. In other words, the effects of the graph topology and delays are confined to higher-order terms. Additionally, we extend our analysis to accommodate time-varying network topologies. Numerical simulations are provided to support our theoretical findings.
Decentralized Optimization in Networks with Arbitrary Delays
Ortega, Tomas, Jafarkhani, Hamid
We consider the problem of decentralized optimization in networks with communication delays. To accommodate delays, we need decentralized optimization algorithms that work on directed graphs. Existing approaches require nodes to know their out-degree to achieve convergence. We propose a novel gossip-based algorithm that circumvents this requirement, allowing decentralized optimization in networks with communication delays. We prove that our algorithm converges on non-convex objectives, with the same main complexity order term as centralized Stochastic Gradient Descent (SGD), and show that the graph topology and the delays only affect the higher order terms. We provide numerical simulations that illustrate our theoretical results.
Asynchronous Federated Learning with Bidirectional Quantized Communications and Buffered Aggregation
Ortega, Tomas, Jafarkhani, Hamid
Asynchronous Federated Learning with Buffered Aggregation (FedBuff) is a state-of-the-art algorithm known for its efficiency and high scalability. However, it has a high communication cost, which has not been examined with quantized communications. To tackle this problem, we present a new algorithm (QAFeL), with a quantization scheme that establishes a shared "hidden" state between the server and clients to avoid the error propagation caused by direct quantization. This approach allows for high precision while significantly reducing the data transmitted during client-server interactions. We provide theoretical convergence guarantees for QAFeL and corroborate our analysis with experiments on a standard benchmark.
Gossiped and Quantized Online Multi-Kernel Learning
Ortega, Tomas, Jafarkhani, Hamid
In instances of online kernel learning where little prior information is available and centralized learning is unfeasible, past research has shown that distributed and online multi-kernel learning provides sub-linear regret as long as every pair of nodes in the network can communicate (i.e., the communications network is a complete graph). In addition, to manage the communication load, which is often a performance bottleneck, communications between nodes can be quantized. This letter expands on these results to non-fully connected graphs, which is often the case in wireless sensor networks. To address this challenge, we propose a gossip algorithm and provide a proof that it achieves sub-linear regret. Experiments with real datasets confirm our findings.
Distributed Q-Learning with State Tracking for Multi-agent Networked Control
Wang, Hang, Lin, Sen, Jafarkhani, Hamid, Zhang, Junshan
This paper studies distributed Q-learning for Linear Quadratic Regulator (LQR) in a multi-agent network. The existing results often assume that agents can observe the global system state, which may be infeasible in large-scale systems due to privacy concerns or communication constraints. In this work, we consider a setting with unknown system models and no centralized coordinator. We devise a state tracking (ST) based Q-learning algorithm to design optimal controllers for agents. Specifically, we assume that agents maintain local estimates of the global state based on their local information and communications with neighbors. At each step, every agent updates its local global state estimation, based on which it solves an approximate Q-factor locally through policy iteration. Assuming decaying injected excitation noise during the policy evaluation, we prove that the local estimation converges to the true global state, and establish the convergence of the proposed distributed ST-based Q-learning algorithm. The experimental studies corroborate our theoretical results by showing that our proposed method achieves comparable performance with the centralized case.