Gündüz, Deniz
SING: Semantic Image Communications using Null-Space and INN-Guided Diffusion Models
Chen, Jiakang, Yilmaz, Selim F., You, Di, Dragotti, Pier Luigi, Gündüz, Deniz
Joint source-channel coding systems based on deep neural networks (DeepJSCC) have recently demonstrated remarkable performance in wireless image transmission. Existing methods primarily focus on minimizing distortion between the transmitted image and the reconstructed version at the receiver, often overlooking perceptual quality. This can lead to severe perceptual degradation when transmitting images under extreme conditions, such as low bandwidth compression ratios (BCRs) and low signal-to-noise ratios (SNRs). In this work, we propose SING, a novel two-stage JSCC framework that formulates the recovery of high-quality source images from corrupted reconstructions as an inverse problem. Depending on the availability of information about the DeepJSCC encoder/decoder and the channel at the receiver, SING can either approximate the stochastic degradation as a linear transformation, or leverage invertible neural networks (INNs) for precise modeling. Both approaches enable the seamless integration of diffusion models into the reconstruction process, enhancing perceptual quality. Experimental results demonstrate that SING outperforms DeepJSCC and other approaches, delivering superior perceptual quality even under extremely challenging conditions, including scenarios with significant distribution mismatches between the training and test data.
BICompFL: Stochastic Federated Learning with Bi-Directional Compression
Egger, Maximilian, Bitar, Rawad, Wachter-Zeh, Antonia, Weinberger, Nir, Gündüz, Deniz
We address the prominent communication bottleneck in federated learning (FL). We specifically consider stochastic FL, in which models or compressed model updates are specified by distributions rather than deterministic parameters. Stochastic FL offers a principled approach to compression, and has been shown to reduce the communication load under perfect downlink transmission from the federator to the clients. However, in practice, both the uplink and downlink communications are constrained. We show that bi-directional compression for stochastic FL has inherent challenges, which we address by introducing BICompFL. Our BICompFL is experimentally shown to reduce the communication cost by an order of magnitude compared to multiple benchmarks, while maintaining state-of-the-art accuracies. Theoretically, we study the communication cost of BICompFL through a new analysis of an importance-sampling based technique, which exposes the interplay between uplink and downlink communication costs.
Energy-Constrained Information Storage on Memristive Devices in the Presence of Resistive Drift
El-Geresy, Waleed, Papavassiliou, Christos, Gündüz, Deniz
In this paper, we examine the problem of information storage on memristors affected by resistive drift noise under energy constraints. We introduce a novel, fundamental trade-off between the information lifetime of memristive states and the energy that must be expended to bring the device into a particular state. We then treat the storage problem as one of communication over a noisy, energy-constrained channel, and propose a joint source-channel coding (JSCC) approach to storing images in an analogue fashion. To design an encoding scheme for natural images and to model the memristive channel, we make use of data-driven techniques from the field of deep learning for communications, namely deep joint source-channel coding (DeepJSCC), employing a generative model of resistive drift as a computationally tractable differentiable channel model for end-to-end optimisation. We introduce a modified version of generalised divisive normalisation (GDN), a biologically inspired form of normalisation, that we call conditional GDN (cGDN), allowing for conditioning on continuous channel characteristics, including the initial resistive state and the delay between storage and reading. Our results show that the delay-conditioned network is able to learn an energy-aware coding scheme that achieves a higher and more balanced reconstruction quality across a range of storage delays.
Energy-Aware Dynamic Neural Inference
Bullo, Marcello, Jardak, Seifallah, Carnelli, Pietro, Gündüz, Deniz
This work has been submitted to the IEEE for possible publication. Abstract The growing demand for intelligent applications beyond the network edge, coupled with the need for sustainable operation, are driving the seamless integration of deep learning algorithms into energy-limited, and even energy-harvesting end-devices. However, the stochastic nature of ambient energy sources often results in insufficient harvesting rates, failing to meet the energy requirements for inference and causing significant performance degradation in energy-agnostic systems. To address this problem, we consider an on-device adaptive inference system equipped with an energy-harvester and finite-capacity energy storage. We then allow the device to reduce the run-time execution cost on-demand, by either switching between differently-sized neural networks, referred to as multi-model selection (MMS), or by enabling earlier predictions at intermediate layers, called early exiting (EE). The model to be employed, or the exit point is then dynamically chosen based on the energy storage and harvesting process states. We also study the efficacy of integrating the prediction confidence into the decision-making process. We derive a principled policy with theoretical guarantees for confidence-aware and -agnostic controllers. Moreover, in multi-exit networks, we study the advantages of taking decisions incrementally, exit-by-exit, by designing a lightweight reinforcement learning-based controller. Experimental results show that, as the rate of the ambient energy increases, energy-and confidence-aware control schemes show approximately 5% improvement in accuracy compared to their energy-aware confidence-agnostic counterparts. Incremental approaches achieve even higher accuracy, particularly when the energy storage capacity is limited relative to the energy consumption of the inference model. HE widespread presence of interconnected devices, driven by pervasive and ubiquitous computing paradigms, continuously generates an unprecedented volume of data.
DiffCP: Ultra-Low Bit Collaborative Perception via Diffusion Model
Mao, Ruiqing, Wu, Haotian, Jia, Yukuan, Nan, Zhaojun, Sun, Yuxuan, Zhou, Sheng, Gündüz, Deniz, Niu, Zhisheng
Collaborative perception (CP) is emerging as a promising solution to the inherent limitations of stand-alone intelligence. However, current wireless communication systems are unable to support feature-level and raw-level collaborative algorithms due to their enormous bandwidth demands. In this paper, we propose DiffCP, a novel CP paradigm that utilizes a specialized diffusion model to efficiently compress the sensing information of collaborators. By incorporating both geometric and semantic conditions into the generative model, DiffCP enables feature-level collaboration with an ultra-low communication cost, advancing the practical implementation of CP systems. This paradigm can be seamlessly integrated into existing CP algorithms to enhance a wide range of downstream tasks. Through extensive experimentation, we investigate the trade-offs between communication, computation, and performance. Numerical results demonstrate that DiffCP can significantly reduce communication costs by 14.5-fold while maintaining the same performance as the state-of-the-art algorithm.
Joint Source-Channel Coding: Fundamentals and Recent Progress in Practical Designs
Gündüz, Deniz, Wigger, Michèle A., Tung, Tze-Yang, Zhang, Ping, Xiao, Yong
Semantic- and task-oriented communication has emerged as a promising approach to reducing the latency and bandwidth requirements of next-generation mobile networks by transmitting only the most relevant information needed to complete a specific task at the receiver. This is particularly advantageous for machine-oriented communication of high data rate content, such as images and videos, where the goal is rapid and accurate inference, rather than perfect signal reconstruction. While semantic- and task-oriented compression can be implemented in conventional communication systems, joint source-channel coding (JSCC) offers an alternative end-to-end approach by optimizing compression and channel coding together, or even directly mapping the source signal to the modulated waveform. Although all digital communication systems today rely on separation, thanks to its modularity, JSCC is known to achieve higher performance in finite blocklength scenarios, and to avoid cliff and the levelling-off effects in time-varying channel scenarios. This article provides an overview of the information theoretic foundations of JSCC, surveys practical JSCC designs over the decades, and discusses the reasons for their limited adoption in practical systems. We then examine the recent resurgence of JSCC, driven by the integration of deep learning techniques, particularly through DeepJSCC, highlighting its many surprising advantages in various scenarios. Finally, we discuss why it may be time to reconsider today's strictly separate architectures, and reintroduce JSCC to enable high-fidelity, low-latency communications in critical applications such as autonomous driving, drone surveillance, or wearable systems.
Pragmatic Communication for Remote Control of Finite-State Markov Processes
Talli, Pietro, Santi, Edoardo David, Chiariotti, Federico, Soleymani, Touraj, Mason, Federico, Zanella, Andrea, Gündüz, Deniz
Pragmatic or goal-oriented communication can optimize communication decisions beyond the reliable transmission of data, instead aiming at directly affecting application performance with the minimum channel utilization. In this paper, we develop a general theoretical framework for the remote control of finite-state Markov processes, using pragmatic communication over a costly zero-delay communication channel. To that end, we model a cyber-physical system composed of an encoder, which observes and transmits the states of a process in real-time, and a decoder, which receives that information and controls the behavior of the process. The encoder and the decoder should cooperatively optimize the trade-off between the control performance (i.e., reward) and the communication cost (i.e., channel use). This scenario underscores a pragmatic (i.e., goal-oriented) communication problem, where the purpose is to convey only the data that is most valuable for the underlying task, taking into account the state of the decoder (hence, the pragmatic aspect). We investigate two different decision-making architectures: in pull-based remote control, the decoder is the only decision-maker, while in push-based remote control, the encoder and the decoder constitute two independent decision-makers, leading to a multi-agent scenario. We propose three algorithms to optimize our system (i.e., design the encoder and the decoder policies), discuss the optimality guarantees ofs the algorithms, and shed light on their computational complexity and fundamental limits.
The Rate-Distortion-Perception Trade-off: The Role of Private Randomness
Hamdi, Yassine, Wagner, Aaron B., Gündüz, Deniz
In image compression, with recent advances in generative modeling, the existence of a trade-off between the rate and the perceptual quality (realism) has been brought to light, where the realism is measured by the closeness of the output distribution to the source. It has been shown that randomized codes can be strictly better under a number of formulations. In particular, the role of common randomness has been well studied. We elucidate the role of private randomness in the compression of a memoryless source $X^n=(X_1,...,X_n)$ under two kinds of realism constraints. The near-perfect realism constraint requires the joint distribution of output symbols $(Y_1,...,Y_n)$ to be arbitrarily close the distribution of the source in total variation distance (TVD). The per-symbol near-perfect realism constraint requires that the TVD between the distribution of output symbol $Y_t$ and the source distribution be arbitrarily small, uniformly in the index $t.$ We characterize the corresponding asymptotic rate-distortion trade-off and show that encoder private randomness is not useful if the compression rate is lower than the entropy of the source, however limited the resources in terms of common randomness and decoder private randomness may be.
Mobility Accelerates Learning: Convergence Analysis on Hierarchical Federated Learning in Vehicular Networks
Chen, Tan, Yan, Jintao, Sun, Yuxuan, Zhou, Sheng, Gündüz, Deniz, Niu, Zhisheng
Hierarchical federated learning (HFL) enables distributed training of models across multiple devices with the help of several edge servers and a cloud edge server in a privacy-preserving manner. In this paper, we consider HFL with highly mobile devices, mainly targeting at vehicular networks. Through convergence analysis, we show that mobility influences the convergence speed by both fusing the edge data and shuffling the edge models. While mobility is usually considered as a challenge from the perspective of communication, we prove that it increases the convergence speed of HFL with edge-level heterogeneous data, since more diverse data can be incorporated. Furthermore, we demonstrate that a higher speed leads to faster convergence, since it accelerates the fusion of data. Simulation results show that mobility increases the model accuracy of HFL by up to 15.1% when training a convolutional neural network on the CIFAR-10 dataset.
Multi-Agent Reinforcement Learning for Power Control in Wireless Networks via Adaptive Graphs
Amorosa, Lorenzo Mario, Skocaj, Marco, Verdone, Roberto, Gündüz, Deniz
Wireless communication networks constitute complex systems This interest can be attributed to the innate characteristics of demanding careful optimization of network procedures GNNs, which enable a scalable solution and exhibit inductive to attain predefined performance objectives. Multi-agent deep capability and, thanks to the permutation equivariance property, reinforcement learning (MADRL), owing to its inherent advantages, increased generalization. Notably, these properties find has emerged as a promising strategy for the optimization practical application in works such as [5], where GNNs are of a variety of network problems. Nevertheless, the practical harnessed to capture the dynamic structure of fading channel implementation of MADRL in real systems is hindered by states for the purpose of learning optimal resource allocation challenges related to convergence, which continue to constitute policies in wireless networks. Another domain that has witnessed an active area of research. These challenges encompass the substantial utilization of GNNs is channel management non-stationarity of the environment, the partial observability of within wireless local area networks (WLANs), as evidenced the state, as well as the coordination and cooperation among by works such as [6] and [7]. A notable insight derived from agents [1, 2]. To this end, this paper elucidates the role of the study by Gao et al. [6] is the inherent property of GNNs to leveraging graph structures as an effective means to account provide decentralized inference, rendering them a viable and for non-stationarity in MADRL systems by introducing a promising approach for the practical implementation of overthe-air relational inductive bias in the collective decision-making MADRL systems.