Weighted QMIX: Expanding Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

Neural Information Processing Systems

However, this restriction prevents it from representing value functions in which an agent's ordering over its actions can depend on