General response
Neural Information Processing Systems
Rephrase any claims that seem too strong, add additional references, and discuss more connections to previous work. Paper too long: We will reorganize our paper (see general response). The red lines in the bars represent median rewards. We consistently improve reward under attacks across runs. The critic attack sometimes improves PPO performance (green lines of Figure 1).
Aug-17-2025, 05:37:44 GMT