Optimizing Quantum Error Correction Codes with Reinforcement Learning