Classical Simulation of Quantum Circuits Using Reinforcement Learning: Parallel Environments and Benchmark

Xiao-Yang Liu