HyperMARL: Adaptive Hypernetworks for Multi-Agent RL

Neural Information Processing Systems 

Adaptive cooperation in multi-agent reinforcement learning (MARL) requires policies that can express homogeneous, specialised, or mixed behaviours, yet achieving this adaptivity remains a critical challenge. While parameter sharing (PS) is standard for efficient learning, it notoriously suppresses the behavioural diversity required for specialisation. This failure is largely due to cross-agent gradient interference, a problem we find is surprisingly exacerbated by the common practice of coupling agent IDs with observations. Existing remedies typically add complexity through altered objectives, manually preset diversity levels, or sequential updates, raising the fundamental question of whether this added complexity is necessary. We propose a solution built on a key insight: an agent-conditioned hypernetwork can generate agent-specific parameters and observation- and agent-conditioned gradients, directly countering the interference from coupling agent IDs with observations.
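To make the key insight concrete, the following is a minimal NumPy sketch of an agent-conditioned hypernetwork. It is an illustration under stated assumptions, not the paper's implementation: the linear hypernetwork head, the linear per-agent policy, the learned agent embedding, and all names (`init_hypernet`, `agent_policy_params`, `policy_logits`, `embed`, `W_h`, `b_h`) are hypothetical. The point it shows is that the agent identity enters only through the hypernetwork, so the observation is never concatenated with an agent ID.

```python
import numpy as np


def init_hypernet(rng, n_agents, embed_dim, obs_dim, n_actions):
    """Hypothetical layout: one embedding per agent plus a linear hypernetwork head."""
    # The head must emit a weight matrix and a bias for a linear per-agent policy.
    n_target = obs_dim * n_actions + n_actions
    return {
        "embed": rng.normal(0.0, 1.0, (n_agents, embed_dim)),
        "W_h": rng.normal(0.0, 0.1, (embed_dim, n_target)),
        "b_h": np.zeros(n_target),
    }


def agent_policy_params(hyper, agent_id, obs_dim, n_actions):
    """Generate agent-specific policy parameters from the agent's embedding."""
    e = hyper["embed"][agent_id]
    flat = e @ hyper["W_h"] + hyper["b_h"]
    W = flat[: obs_dim * n_actions].reshape(obs_dim, n_actions)
    b = flat[obs_dim * n_actions:]
    return W, b


def policy_logits(hyper, agent_id, obs):
    """Apply the generated policy to an observation that carries no agent ID."""
    obs_dim = obs.shape[-1]
    # n_target = n_actions * (obs_dim + 1), so n_actions can be recovered from it.
    n_actions = hyper["b_h"].size // (obs_dim + 1)
    W, b = agent_policy_params(hyper, agent_id, obs_dim, n_actions)
    return obs @ W + b


# Usage: three agents share one hypernetwork but see the same observation.
rng = np.random.default_rng(0)
hyper = init_hypernet(rng, n_agents=3, embed_dim=4, obs_dim=5, n_actions=2)
obs = rng.normal(size=5)
logits = [policy_logits(hyper, i, obs) for i in range(3)]
```

In this sketch all agents share `W_h` and `b_h`, while the generated `W` and `b` differ per agent. Because the policy output is a product of an observation term and an agent-embedding term, the gradient with respect to the shared hypernetwork weights depends on both the observation and the agent, which is one way to read "observation- and agent-conditioned gradients".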