The Power of Exploiter: Provable Multi-Agent RL in Large State Spaces