Bi-level Latent Variable Model for Sample-Efficient Multi-Agent Reinforcement Learning