Reinforcement Learning-based Adaptive Path Selection for Programmable Networks