Reinforcement learning for graph theory, Parallelizing Wagner's approach