Review for NeurIPS paper: Learning Implicit Credit Assignment for Cooperative Multi-Agent Reinforcement Learning

May-30-2025, 00:13:17 GMT–Neural Information Processing Systems

Weaknesses: The first essential issue with the LICA algorithm is that the definition of the centralized value function is not clear. In particular, what exactly is the proposed value function trying to approximate? During training, the centralized value function is conditioned on a sampled joint action (Eq. 3), whereas during policy updates it is conditioned on the concatenation of the action-probability vectors output by each agent's policy. Because the critic's input differs between these two phases, the critic should not be expected to provide a correct value estimate for the stochastic policies when the policy gradient is computed. The paper should give a further explanation and a theoretical analysis of this approach.
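To make the concern concrete, the following is a minimal toy sketch, not the paper's critic or architecture. It assumes two agents with two actions each, independent policies, and a hypothetical nonlinear function `toy_critic` defined on the concatenated action vectors; all names are illustrative. It compares the expected value of the critic over sampled one-hot joint actions with the value obtained by feeding the concatenated probability vectors directly.

```python
import itertools
import numpy as np

def toy_critic(x):
    # Hypothetical nonlinear critic (an assumption for illustration only):
    # squared count of agents choosing action 0, read off the concatenated input.
    return (x[0] + x[2]) ** 2

def one_hot(index, size):
    v = np.zeros(size)
    v[index] = 1.0
    return v

def expected_value(critic, policies):
    # E_{a ~ pi}[critic(one-hot joint action)], assuming independent agent policies.
    total = 0.0
    for joint in itertools.product(*(range(len(p)) for p in policies)):
        prob = np.prod([p[a] for p, a in zip(policies, joint)])
        x = np.concatenate([one_hot(a, len(p)) for p, a in zip(policies, joint)])
        total += prob * critic(x)
    return total

def value_at_probabilities(critic, policies):
    # Critic evaluated directly on the concatenated action-probability vectors.
    return critic(np.concatenate(policies))

policies = [np.array([0.5, 0.5]), np.array([0.5, 0.5])]
expected = expected_value(toy_critic, policies)
plugged = value_at_probabilities(toy_critic, policies)
print("expected over sampled joint actions:", expected)
print("critic at concatenated probabilities:", plugged)
```

In this toy case the two quantities differ, whereas they would coincide if the critic were linear in its action input. This suggests that any agreement between the two uses depends on the critic's functional form, which is the point the requested analysis should address.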



