Supplementary Material for "AttendLight: Universal Attention-Based Reinforcement Learning Model for Traffic Signal Control"

Neural Information Processing Systems 

Appendix A includes the details of all numerical experiments. In Appendix A.1, we describe our traffic data and how we generate the synthetic data. Appendices A.2 and A.3 explain the AttendLight model as well as the RL training that we consider throughout the paper. Appendix A.4 gives a brief explanation of our baseline algorithms. Finally, in Appendix A.5, we provide extensive results of AttendLight for the single-env and multi-env regimes.

A.1 Details of Real-World Data and the Synthetic Data Generation

In each intersection, we consider three traffic movement sets, namely straight, turn-left, and turn-right. These sets are used in defining the traffic data of an intersection in both real-world and synthetic cases.
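To make the three movement sets concrete, the following is a minimal sketch of how the movements of one intersection could be grouped by type. It is an illustration only, not the paper's data format: the `Turn` enum, the `MOVEMENTS` list, the `group_by_turn` helper, and the four-approach layout labeled N, E, S, W are all assumptions introduced here.

```python
from enum import Enum


class Turn(Enum):
    """The three movement types named in the text."""
    STRAIGHT = "straight"
    LEFT = "turn-left"
    RIGHT = "turn-right"


# Hypothetical 4-way intersection. Each movement is
# (side the vehicle enters from, side it exits to, movement type).
# A vehicle entering from N travels south, so its left is E and its right is W.
MOVEMENTS = [
    ("N", "S", Turn.STRAIGHT), ("N", "E", Turn.LEFT), ("N", "W", Turn.RIGHT),
    ("S", "N", Turn.STRAIGHT), ("S", "W", Turn.LEFT), ("S", "E", Turn.RIGHT),
    ("E", "W", Turn.STRAIGHT), ("E", "S", Turn.LEFT), ("E", "N", Turn.RIGHT),
    ("W", "E", Turn.STRAIGHT), ("W", "N", Turn.LEFT), ("W", "S", Turn.RIGHT),
]


def group_by_turn(movements):
    """Split a list of movements into the straight, turn-left, and turn-right sets."""
    sets = {turn: [] for turn in Turn}
    for origin, destination, turn in movements:
        sets[turn].append((origin, destination))
    return sets


movement_sets = group_by_turn(MOVEMENTS)
```

With this layout each of the three sets holds four movements. An intersection with fewer approaches or with prohibited turns would simply list fewer entries in `MOVEMENTS`.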