A Reinforcement Learning Formulation of the Lyapunov Optimization: Application to Edge Computing Systems with Queue Stability

Sohee Bae, Seungyul Han, Youngchul Sung

Dec-15-2020, arXiv.org Artificial Intelligence

This paper considers a deep reinforcement learning (DRL)-based approach to Lyapunov optimization that minimizes the time-average penalty while maintaining queue stability. State and action spaces are constructed so that the problem forms a proper Markov decision process (MDP). A condition on the reinforcement learning (RL) reward function that ensures queue stability is derived. Based on this analysis and on practical RL with reward discounting, a class of reward functions is proposed for the DRL-based approach. The proposed approach does not require complicated optimization at each time step and operates with general non-convex and discontinuous penalty functions. Hence, it provides an alternative to the conventional drift-plus-penalty (DPP) algorithm for Lyapunov optimization. The approach is applied to resource allocation in edge computing systems with queue stability, and numerical results demonstrate its successful operation.
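
To make the idea of an RL reward tied to queue stability concrete, the following is a minimal sketch, not the reward class proposed in the paper (which this abstract does not spell out). It assumes a single queue, a quadratic Lyapunov function, and a per-step reward equal to the negative drift-plus-penalty. The names `step_queue`, `lyapunov`, `dpp_reward`, `simulate`, the weight `v`, the quadratic service cost, and the uniform arrival process are all hypothetical choices for illustration.

```python
import random


def step_queue(q, arrival, service):
    """Standard queue update: Q(t+1) = max(Q(t) - service, 0) + arrival."""
    return max(q - service, 0.0) + arrival


def lyapunov(q):
    """Quadratic Lyapunov function L(Q) = 0.5 * Q^2 for a single queue."""
    return 0.5 * q * q


def dpp_reward(q, q_next, penalty, v=1.0):
    """Assumed reward: negative of (V * penalty + one-step Lyapunov drift)."""
    return -(v * penalty + lyapunov(q_next) - lyapunov(q))


def simulate(policy, steps=1000, v=1.0, seed=0):
    """Roll out a fixed policy; return final queue, mean penalty, mean reward."""
    rng = random.Random(seed)
    q, total_penalty, total_reward = 0.0, 0.0, 0.0
    for _ in range(steps):
        service = policy(q)              # action: amount of work served
        penalty = service ** 2           # hypothetical cost of serving
        arrival = rng.uniform(0.0, 2.0)  # hypothetical arrival process
        q_next = step_queue(q, arrival, service)
        total_reward += dpp_reward(q, q_next, penalty, v)
        total_penalty += penalty
        q = q_next
    return q, total_penalty / steps, total_reward / steps


if __name__ == "__main__":
    # A simple hand-written policy standing in for a learned DRL policy.
    q_final, avg_penalty, avg_reward = simulate(lambda q: min(q, 1.5))
    print(q_final, avg_penalty, avg_reward)
```

With this undiscounted reward, the drift terms telescope, so the summed reward equals the negative of the weighted total penalty plus the final Lyapunov value. A policy that lets the queue grow is therefore penalized through the final queue length, which is one intuition for why a reward of this shape can encourage stability. Discounting breaks the exact telescoping, which is the practical issue the abstract alludes to.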

Country:
- North America > United States
  - Massachusetts > Plymouth County > Hanover (0.04)
- Europe > Sweden
  - Stockholm > Stockholm (0.04)
- Asia > South Korea
  - Daejeon > Daejeon (0.04)

Genre:
- Research Report (0.69)

Industry:
- Telecommunications (0.67)
- Energy > Power Industry (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Reinforcement Learning (1.00)
  - Representation & Reasoning > Optimization (0.93)
