Reviews: Policy Gradient With Value Function Approximation For Collective Multiagent Planning

Neural Information Processing Systems 

The paper presents a policy gradient algorithm for a cooperative multiagent problem, modeled in a formalism (CDEC-POMDP) whose dynamics, as in congestion games, depend on groups of agents rather than on individuals. The paper continues a line of similar advances in the field of complex multiagent planning, using factored models to derive practical, tractable approximations. The novelty here is the use of parameterized policies and of training algorithms inspired by reinforcement learning (policy gradients). The work is well motivated, relevant, and particularly well presented. The theoretical results are new and important.
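For readers unfamiliar with the family of methods the review refers to, the following is a minimal sketch of a vanilla policy-gradient (REINFORCE) update with a value-function baseline, the general technique the paper builds on. The toy one-step environment, the softmax policy parameterization, and all variable names here are illustrative assumptions for exposition only; they are not the paper's CDEC-POMDP model or its factored critic.

```python
import numpy as np

# Hypothetical toy setup: a single-step problem with 3 actions whose mean
# rewards are fixed; the goal is to learn logits that favor the best action.
rng = np.random.default_rng(0)

n_actions = 3
true_rewards = np.array([1.0, 0.2, -0.5])  # assumed toy reward means

theta = np.zeros(n_actions)   # policy logits (parameterized policy)
baseline = 0.0                # running scalar value baseline
alpha, beta = 0.1, 0.1        # step sizes (chosen for the toy example)

def softmax(z):
    z = z - z.max()           # numerical stability
    e = np.exp(z)
    return e / e.sum()

for _ in range(2000):
    probs = softmax(theta)
    a = rng.choice(n_actions, p=probs)            # sample action from policy
    r = true_rewards[a] + rng.normal(scale=0.1)   # noisy reward
    advantage = r - baseline                      # baseline reduces variance
    # grad of log pi(a) for a softmax policy: one_hot(a) - probs
    grad_log_pi = -probs
    grad_log_pi[a] += 1.0
    theta += alpha * advantage * grad_log_pi      # policy-gradient ascent step
    baseline += beta * (r - baseline)             # update the value baseline

best = int(np.argmax(softmax(theta)))
```

After training, `best` identifies the highest-reward action; the baseline serves the same variance-reduction role that a learned value function plays in actor-critic methods.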