Predictive Q-Routing: A Memory-based Reinforcement Learning Approach to Adaptive Traffic Control

Choi, Samuel P. M., Yeung, Dit-Yan

Neural Information Processing Systems 

The controllers usually have little or no prior knowledge of the environment. Although only local communication between controllers is allowed, the controllers must cooperate among themselves to achieve the common, global objective. Finding the optimal routing policy in such a distributed manner is very difficult. Moreover, since the environment is non-stationary, the optimal policy varies with time as network traffic and topology change.
