Coevolving with the Other You: Fine-Tuning LLM with Sequential Cooperative Multi-Agent Reinforcement Learning

Hao Ma1,2 Zhiqiang Pu1,2