Review for NeurIPS paper: Robust Reinforcement Learning via Adversarial training with Langevin Dynamics
–Neural Information Processing Systems
Weaknesses: The main weakness of the work is its presentation. For a reader who is not intimately familiar with the background material, Section 2 is not self-contained, and the significance of mixed versus pure Nash equilibria (NE) is not explained. The part of the paper that would benefit most from additional discussion, however, is the experiments section, which currently features a very large Figure 4 (consider cutting it to half its current size) and little discussion of the results themselves. When does the proposed method work better or worse than the baselines, and is there intuition for why? Some videos of the learned policies would also nicely supplement the results.
Jan-24-2025, 18:43:42 GMT