Optimizing ZX-Diagrams with Deep Reinforcement Learning