Power Allocation for Delay Optimization in Device-to-Device Networks: A Graph Reinforcement Learning Approach

Hao Fang, Kai Huang, Hao Ye, Chongtao Guo, Le Liang, Xiao Li, Shi Jin

arXiv.org Artificial Intelligence 

The pursuit of rate maximization in wireless communication frequently encounters substantial challenges associated with user fairness. The proposed approach incorporates not only channel state information but also factors such as packet delay, the number of backlogged packets, and the number of transmitted packets into the components of the state information. We adopt a centralized RL method, where a central controller collects and processes the state information. The central controller functions as an agent trained using the proximal policy optimization (PPO) algorithm. To better utilize topology information in the communication network and enhance the generalization of the proposed method, we embed GNN layers into both the actor and critic networks of the PPO algorithm. This integration allows for efficient parameter updates of GNNs and enables the state information to be parameterized as a low-dimensional embedding, which is leveraged by the agent to optimize power allocation strategies. Simulation results demonstrate that the proposed method effectively reduces average delay while ensuring user fairness, outperforms baseline methods, and exhibits scalability and generalization capability.

Device-to-device (D2D) communication, which enables direct data exchange between devices without the involvement of base stations or relay devices, can occur both within and independently of cellular network coverage [1]. This communication mode is particularly significant in 5G networks due to its potential to enhance communication efficiency, reduce delay, and increase network capacity [2].

Hao Fang, Kai Huang, Xiao Li, and Shi Jin are with the National Mobile Communications Research Laboratory, Southeast University, Nanjing 210096, China (e-mail: fhao seu@seu.edu.cn). Chongtao Guo is with the College of Electronics and Information Engineering, Shenzhen University, Shenzhen 518060, China (e-mail: ct-guo@szu.edu.cn). Le Liang is with the National Mobile Communications Research Laboratory, Southeast University, Nanjing 210096, China, and also with Purple Mountain Laboratories, Nanjing 211111, China (e-mail: lliang@seu.edu.cn). Hao Ye is with the Department of Electrical and Computer Engineering, University of California, Santa Cruz, CA 95064, USA (e-mail: yehao@ucsc.edu).

These include scenarios like autonomous driving, holographic communication, and extended reality, which impose extremely stringent reliability and delay requirements.
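To make the idea of embedding GNN layers into the actor more concrete, the following is a minimal sketch, not the authors' architecture. It assumes each D2D link is a graph node carrying four hypothetical features (direct channel gain, packet delay, backlogged packets, transmitted packets), that interference gains act as edge weights, and that a single mean-aggregation round followed by a sigmoid head yields a power level per link. All function names, dimensions, and weights here are illustrative assumptions; PPO training and the critic network are omitted.

```python
import math
import random


def gnn_layer(node_feats, interference, w_self, w_nbr):
    """One mean-aggregation message-passing round over the D2D link graph.

    node_feats[i]      : features of link i (assumed order: direct channel
                         gain, packet delay, backlog, packets transmitted)
    interference[i][j] : gain from transmitter j to receiver i (edge weight)
    w_self, w_nbr      : weight matrices shared by every node
    """
    n = len(node_feats)
    dim_in = len(node_feats[0])
    out = []
    for i in range(n):
        # Interference-weighted mean of the other links' features.
        agg = [0.0] * dim_in
        for j in range(n):
            if j != i:
                for k in range(dim_in):
                    agg[k] += interference[i][j] * node_feats[j][k]
        if n > 1:
            agg = [a / (n - 1) for a in agg]
        # Shared linear maps on own and aggregated features, then ReLU.
        h = []
        for row_s, row_n in zip(w_self, w_nbr):
            z = sum(ws * x for ws, x in zip(row_s, node_feats[i]))
            z += sum(wn * a for wn, a in zip(row_n, agg))
            h.append(max(0.0, z))
        out.append(h)
    return out


def actor_powers(node_feats, interference, params, p_max=1.0):
    """Map each link embedding to a transmit power between 0 and p_max."""
    emb = gnn_layer(node_feats, interference,
                    params["w_self"], params["w_nbr"])
    powers = []
    for h in emb:
        logit = sum(w * x for w, x in zip(params["w_out"], h))
        powers.append(p_max / (1.0 + math.exp(-logit)))  # scaled sigmoid
    return powers


def random_params(dim_in=4, dim_hidden=8, seed=0):
    """Untrained placeholder weights; a real agent would learn these."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                for _ in range(rows)]

    return {"w_self": mat(dim_hidden, dim_in),
            "w_nbr": mat(dim_hidden, dim_in),
            "w_out": [rng.uniform(-0.5, 0.5) for _ in range(dim_hidden)]}


# Example: three links with made-up features and interference gains.
params = random_params()
feats = [[0.9, 2.0, 3.0, 1.0], [0.5, 5.0, 6.0, 0.0], [0.7, 1.0, 1.0, 2.0]]
gains = [[0.0, 0.1, 0.2], [0.1, 0.0, 0.3], [0.2, 0.3, 0.0]]
print(actor_powers(feats, gains, params))
```

Because the weight matrices are shared across nodes, the same parameters can be applied to a network with any number of links, and relabeling the links simply relabels the output powers. This is one plausible reading of why a graph-based actor would help with the scalability and generalization mentioned above.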
