Reinforcement Learning in Queue-Reactive Models: Application to Optimal Execution