Reinforcement Learning Driven Heuristic Optimization