Reinforcement Learning for Chemical Ordering in Alloy Nanoparticles