Experienced Deep Reinforcement Learning with Generative Adversarial Networks (GANs) for Model-Free Ultra Reliable Low Latency Communication

arXiv.org Machine Learning

In this paper, a novel experienced deep reinforcement learning (deep-RL) framework is proposed to provide model-free resource allocation for ultra reliable low latency communication (URLLC) in the downlink of a wireless network. The proposed experienced deep-RL framework can guarantee high end-to-end reliability and low end-to-end latency, under explicit data rate constraints, for each wireless user without relying on any model of, or assumptions about, the users' traffic. In particular, to enable the deep-RL framework to account for extreme network conditions and operate in highly reliable systems, a new approach based on generative adversarial networks (GANs) is proposed. This GAN approach is used to pre-train the deep-RL framework using a mix of real and synthetic data, thus creating an experienced deep-RL framework that has been exposed to a broad range of network conditions. Formally, the URLLC resource allocation problem is posed as a power minimization problem under reliability, latency, and rate constraints. To solve this problem using experienced deep-RL, first, the rate of each user is determined. Then, these rates are mapped to the resource block and power allocation vectors of the studied wireless system. Finally, the end-to-end reliability and latency of each user are used as feedback to the deep-RL framework. It is then shown that, at the fixed point of the deep-RL algorithm, the reliability and latency of the users are near-optimal. Moreover, for the proposed GAN approach, a theoretical limit for the generator output is analytically derived. Simulation results show how the proposed approach can achieve near-optimal performance within the rate-reliability-latency region, depending on the network and service requirements. The results also show that the proposed experienced deep-RL framework is able to remove the transient training time that makes conventional deep-RL methods unsuitable for URLLC.

A. Taleb Zadeh Kasgari and W. Saad are with Wireless@VT, Department of ECE, Virginia Tech, Blacksburg, VA, 24060, USA. M. Mozaffari is with Ericsson Research, Santa Clara, CA, 95054, USA, Email: mohammad.mozaffari@ericsson.com. H. V. Poor is with the Department of Electrical Engineering, Princeton University, Princeton, NJ, 08544, USA, Email: poor@princeton.edu. A preliminary version of this work appeared in IEEE ICC [1].

I. INTRODUCTION

Ultra reliable low latency communication (URLLC) will be one of the most important features of next-generation 5G and beyond cellular networks, as it will be necessary for mission-critical applications such as Internet of Things (IoT) [2] sensing and control, as well as remote control of autonomous vehicles and drones [3], [4]. Thus far, URLLC research has mostly focused on applications that require low data rates, such as uplink transmissions of IoT sensors [3], [5].
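
The pre-training idea summarized in the abstract, exposing the learner to a mix of real and synthetic network conditions before deployment, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration only: the function names, the sample fields, the 30% synthetic share, and in particular `toy_generator`, which is a hand-written placeholder standing in for a trained GAN generator. It is not the authors' implementation.

```python
import random


def build_pretraining_buffer(real_samples, generate_synthetic, size,
                             synthetic_fraction, seed=0):
    """Return `size` tagged samples, a fixed share of them synthetic, shuffled."""
    if not 0.0 <= synthetic_fraction <= 1.0:
        raise ValueError("synthetic_fraction must be in [0, 1]")
    rng = random.Random(seed)
    n_synthetic = round(size * synthetic_fraction)
    n_real = size - n_synthetic
    # Draw real experience with replacement so a small dataset still suffices.
    real_part = [("real", rng.choice(real_samples)) for _ in range(n_real)]
    # Any callable that accepts an RNG can play the role of the generator.
    synthetic_part = [("synthetic", generate_synthetic(rng))
                      for _ in range(n_synthetic)]
    buffer = real_part + synthetic_part
    rng.shuffle(buffer)
    return buffer


def toy_generator(rng):
    """Hypothetical stand-in for a GAN generator: rare, heavy-load conditions."""
    return {
        "arrival_rate_pkts_per_ms": rng.uniform(8.0, 10.0),
        "channel_gain_db": rng.uniform(-120.0, -110.0),
    }


# Hypothetical "real" measurements covering only mild network conditions.
real = [
    {"arrival_rate_pkts_per_ms": 1.0, "channel_gain_db": -90.0},
    {"arrival_rate_pkts_per_ms": 1.5, "channel_gain_db": -95.0},
]

buffer = build_pretraining_buffer(real, toy_generator, size=100,
                                  synthetic_fraction=0.3)
n_synthetic = sum(1 for tag, _ in buffer if tag == "synthetic")
```

In this sketch the synthetic share is fixed in advance, so the learner is guaranteed to see extreme conditions that the real measurements do not contain. How the mix is actually chosen in the proposed framework is not stated in the text above.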