Data efficient reinforcement learning and adaptive optimal perimeter control of network traffic dynamics