Order Matters: Agent-by-agent Policy Optimization