We sincerely thank the reviewers for their helpful comments
–Neural Information Processing Systems
We sincerely thank the reviewers for their helpful comments. The baselines do not solve BiMGame and AntMaze even with optimal trajectories, as Figs. D and E show. We see similar trends for AggreVaTeD. Although they make some progress before stagnating, their cumulative terminal-only reward is 0 (see Lines 300-302). We only assume an ordering of state groups, which is implicit in many tasks.
Nov-17-2025, 03:24:03 GMT