Advancing Multi-agent Traffic Simulation via R1-Style Reinforcement Fine-Tuning