Review for NeurIPS paper: Bayesian Multi-type Mean Field Multi-agent Imitation Learning

Neural Information Processing Systems 

Clarity: The paper is generally well written, though it suffers from a lack of clarity in some important sections: 4. [Equation 1] I believe the inner log in the right-hand term of Equation (1) should not be present. I assumed it was a typo, but it appears throughout the text, even in the authors' proposed approach (e.g., in Equation 3). If it is intentional, why is it necessary? The paper introduces the problem scenario as a Markov game in Section 2.1; however, it introduces the notion of binary observations (which are a function of rewards here) in Section 3.1.1. This seems to suggest that the problem formulation should perhaps be corrected to a partially observable Markov game, i.e., a partially observable stochastic game (POSG).