Review for NeurIPS paper: Effective Diversity in Population Based Reinforcement Learning

Neural Information Processing Systems 

Weaknesses: The paper needs to address a few important issues, detailed below. Why is it important to enhance population-wide behavioral diversity? Intuitively, I can understand the potential benefits for deep exploration and learning stability. Theoretically, however, I cannot link these benefits directly to the proposed use of a kernel function and the determinant of the kernel matrix. Theorem 3.3 states that when lambda is set properly, the population will contain M distinct optimal policies.
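
To make the object under discussion concrete, here is a minimal sketch of how the determinant of a kernel matrix can behave as a population-wide diversity score. It is an illustration under stated assumptions, not the paper's implementation: the RBF kernel, the `lengthscale` parameter, the two-dimensional behavioral embeddings, and the function names are all hypothetical choices made for this example. The only point it shows is a general linear-algebra fact: if two policies have identical embeddings, two rows of the kernel matrix coincide and the determinant collapses to zero, whereas well-separated embeddings give a determinant close to one.

```python
import numpy as np


def rbf_kernel_matrix(embeddings, lengthscale=1.0):
    """Pairwise RBF kernel over behavioral embeddings (assumed kernel choice)."""
    diffs = embeddings[:, None, :] - embeddings[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * lengthscale ** 2))


def diversity_score(embeddings, lengthscale=1.0):
    """Determinant of the kernel matrix, used here as a diversity measure."""
    return float(np.linalg.det(rbf_kernel_matrix(embeddings, lengthscale)))


# Hypothetical embeddings for a population of M = 3 policies.
distinct = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]])
# Same population, but the second policy has collapsed onto the first.
collapsed = np.array([[0.0, 0.0], [0.0, 0.0], [0.0, 3.0]])

print("distinct population :", diversity_score(distinct))
print("collapsed population:", diversity_score(collapsed))
```

The sketch says nothing about optimality: it only shows that the determinant penalizes duplicated behaviors, which is separate from the question of how that penalty interacts with the reward term weighted by lambda.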