Reinforcement Learning
NetworkGym: Reinforcement Learning Environments
We make use of four internal 12 GB NVIDIA TITAN Xp GPUs to perform our experiments. ... At initialization of each environment, four UEs are randomly stationed 1.5 meters above the ... The LTE base station lies at (x, z) = (40 m, 3 m). ... We use random seed values from 0 to 63, inclusive, for this parameter. ... We train PTD3 for 10,000 steps, instead of the 1,000,000 steps we use for TD3+BC.
NetworkGym: Reinforcement Learning Environments for Multi-Access Traffic Management in Network Simulation. Momin Haider (UC Santa Barbara), Ming Yin
Mobile devices such as smartphones, laptops, and tablets can often connect to multiple access networks (e.g., Wi-Fi, LTE, and 5G) simultaneously. Recent advancements facilitate seamless integration of these connections below the transport layer, enhancing the experience for apps that lack inherent multi-path support. This optimization hinges on dynamically determining the traffic distribution across networks for each device, a process referred to as multi-access traffic splitting. This paper introduces NetworkGym, a high-fidelity network environment simulator that facilitates generating multiple network traffic flows and multi-access traffic splitting.
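To make the idea of multi-access traffic splitting concrete, the following is a minimal sketch, assuming a single device with two links and a fixed target share of packets on Wi-Fi. The function name `split_traffic` and the parameter `wifi_ratio` are hypothetical and are not part of NetworkGym's API; in an RL setting the ratio would be chosen by the agent rather than fixed.

```python
def split_traffic(num_packets: int, wifi_ratio: float) -> list[str]:
    """Assign each packet to "wifi" or "lte" so the Wi-Fi share tracks wifi_ratio.

    Hypothetical illustration only: a deterministic splitter for one device
    with two access networks.
    """
    if not 0.0 <= wifi_ratio <= 1.0:
        raise ValueError("wifi_ratio must lie in [0, 1]")

    assignment = []
    wifi_sent = 0
    for i in range(1, num_packets + 1):
        # Send on Wi-Fi only if that keeps the Wi-Fi count at or below
        # the target share of the first i packets (small tolerance for floats).
        if wifi_sent + 1 <= wifi_ratio * i + 1e-9:
            assignment.append("wifi")
            wifi_sent += 1
        else:
            assignment.append("lte")
    return assignment


if __name__ == "__main__":
    links = split_traffic(8, 0.75)
    print(links, links.count("wifi") / len(links))
```

A deterministic rule is used here instead of random sampling so that the realized split stays close to the target even over short packet runs.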
SustainDC: Benchmarking for Sustainable Data Center Control Supplementary Information
F Reward Evaluation and Customization: F.1 Load Shifting Penalty (LS ...; F.2 Default Reward Function; F.3 Customization of Reward Formulations. ... Current Workload: the current workload level, which includes both flexible and non-flexible components. ... The data center modeled is illustrated in Figure 1. The hot air exits the cabinets and returns to the CRAH via the ceiling.