Mitigating Relative Over-Generalization in Multi-Agent Reinforcement Learning