Reinforcement Learning Pair Trading: A Dynamic Scaling Approach