Review for NeurIPS paper: Learning Individually Inferred Communication for Multi-Agent Cooperation

Neural Information Processing Systems 

Summary and Contributions: This paper introduces I2C, a multi-agent communication architecture for cooperative tasks in which each agent decides which other agents to receive messages from. This differs from prior work on multi-agent communication, which has primarily focused on broadcast-style communication, with one or all agents sending messages to all other agents. The motivation is to reduce redundant communication (which might ease learning) and to make the overall setup more practically realizable. I2C consists of a "prior network", which takes agent i's observation as input and predicts a probability distribution over which other agents to receive messages from. This prior network is trained with supervised learning to minimize the KL divergence between the probability of agent i's action given the actions of all agents other than i and the probability of agent i's action given the actions of all agents other than i and j; the idea is that the prior network should enable communication only from those agents that are likely to have a strong influence on agent i's action.
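To make the KL-based comparison in this summary concrete, here is a minimal sketch of how the influence of agent j on agent i could be scored. It assumes a small discrete action set and that the two conditional action distributions are already available as probability vectors. The function names and the numbers are hypothetical and are not taken from the paper.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same action set.

    eps guards against log(0); terms with p_k == 0 contribute nothing.
    """
    return sum(
        pk * math.log((pk + eps) / (qk + eps))
        for pk, qk in zip(p, q)
        if pk > 0
    )


def influence_of_agent_j(p_with_j, p_without_j):
    """Score how much agent j's action changes agent i's action distribution.

    p_with_j:    P(a_i | actions of all agents other than i)        (assumed given)
    p_without_j: P(a_i | actions of all agents other than i and j)  (assumed given)
    """
    return kl_divergence(p_with_j, p_without_j)


# Hypothetical distributions over three actions for agent i.
p_with_j = [0.7, 0.2, 0.1]
p_without_j = [0.4, 0.3, 0.3]

effect = influence_of_agent_j(p_with_j, p_without_j)       # j changes i's policy
no_effect = influence_of_agent_j(p_without_j, p_without_j)  # j is irrelevant
```

In this reading, a score near zero means agent j's action tells agent i nothing it could not infer from the other agents, so a message from j would be redundant. A larger score marks j as a candidate to receive messages from.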