Interpretable Reinforcement Learning for Load Balancing using Kolmogorov-Arnold Networks
Singh, Kamal, Marouani, Sami, Sheikh, Ahmad Al, Quang, Pham Tran Anh, Habrard, Amaury
arXiv.org Artificial Intelligence
As load and delta load increase, the policy places more flows on the Internet link, whereas increasing Internet delay shifts flows to MPLS. The contribution of Internet loss seems counterintuitive, as it appears to put more load on the Internet link. However, even though its coefficient is close to 1.0, the overall contribution of the term is negligible compared with that of load, because loss in our scenario varies only from 0 to around 0.15. The same applies to delay.

For minimising loss, we extract the following:

a 1. 9 1 .1( 2 λ 3 + 1) 2 2λ i 5 + 10 d i 3 + u i 10 (4)

This policy can be interpreted as follows, with reference to Figure 1. The ratio starts near 0.8, and increasing load, together with increasing delta, puts more traffic on the Internet link. Increasing Internet delay and Internet link utilisation shift the balance slightly towards putting more traffic on the MPLS link.

Distillation of symbolic equations of PPO policy: In this method, we train a policy using PPO, generate trajectory data, and then generate the symbolic equations using auto-regressive models [22].
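The earlier argument, that a term with a coefficient close to 1.0 can still contribute negligibly, can be checked with a small sketch. It compares the largest swing each term can produce over its input range. Only the loss range (0 to around 0.15) and the loss coefficient near 1.0 come from the text; the load coefficient, the load range, and the names `terms`, `max_swing`, `swings` and `share` are hypothetical and chosen purely for illustration.

```python
# Illustrative only: the load coefficient and load range are ASSUMED,
# not taken from the paper. The loss range (0 to ~0.15) and the
# loss coefficient (~1.0) are the values stated in the text.
terms = {
    "load": {"coef": 1.0, "lo": 0.0, "hi": 10.0},  # hypothetical range
    "loss": {"coef": 1.0, "lo": 0.0, "hi": 0.15},  # range from the text
}


def max_swing(term):
    """Largest change this term can cause in the output over its range."""
    return abs(term["coef"]) * (term["hi"] - term["lo"])


# Maximum contribution of each term, and its share of the total swing.
swings = {name: max_swing(t) for name, t in terms.items()}
total = sum(swings.values())
share = {name: s / total for name, s in swings.items()}

for name in terms:
    print(f"{name}: swing={swings[name]:.3f}, share={share[name]:.1%}")
```

Under these assumed ranges the loss term accounts for only a small fraction of the total swing, so a seemingly large coefficient does not translate into a large effect on the policy output.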
May-21-2025