Reviews: Multi-Agent Common Knowledge Reinforcement Learning
–Neural Information Processing Systems
My two biggest complaints center on 1) the illustrative single-step matrix game of Section 4.1 and Figure 3, and 2) the practical applications of MACKRL. 1) Since the primary role of the single-step matrix game in Section 4.1 is illustrative, it should be much clearer what is going on. How are all three policies parameterized? What information does each have access to? What is the training data? First, let's focus on the JAL policy. As presented up to this point in the paper, JAL means centralized training *and* execution.
Jun-2-2025, 00:33:33 GMT