Review for NeurIPS paper: Joint Policy Search for Multi-agent Collaboration with Imperfect Information
–Neural Information Processing Systems
Additional Feedback: Questions/Comments
- There is a slight inconsistency between Equations (1) and (3): (1) uses A(I(h)), while (3) uses A(h).
- Line 142: What is meant by the notation with a bar over the v? I don't see it introduced anywhere. This is confusing, since your main theorem involves the difference between two overbar v quantities. It seems like this might be the value of the root node under the policy, but that is not explicitly stated anywhere.
- It looks like you use the CFR1k strategy as a starting point for JPS. Did you experiment with using the other strategies (BAD and A2C) as starting points?
Feb-12-2025, 00:12:03 GMT