Pointer Networks with Q-Learning for OP Combinatorial Optimization