Supplementary Material A Details on experimental setups
Neural Information Processing Systems
We first collect a trajectory from the default environment (black transitions in the figures) and visualize the next states obtained by applying the same action to the same state under different environment parameters. One can observe that the transition dynamics follow multi-modal distributions. The objective of CartPoleSwingUp is to swing up the pole by moving a cart and to keep the pole upright within 500 time steps. For our experiments, we varied the mass of the cart and pole over the set {0.25, 0.5, 1.5, 2.5} for training and evaluated generalization performance in unseen environments with masses in {0.1, 0.15, 2.75, 3.0}. We visualize the transitions in Figure 8a.
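The setup above can be illustrated with a minimal Python sketch. It is not the authors' code. The mass sets and the 500-step horizon come from the text, but the names `TRAIN_MASSES`, `TEST_MASSES`, `sample_mass`, `toy_next_velocity`, and `next_states_for_same_input` are hypothetical, uniform sampling over the set is an assumption, and the one-dimensional update stands in for the real CartPoleSwingUp physics only to show why the same state and action can lead to several distinct next states.

```python
import random

# Mass values quoted in the text.
TRAIN_MASSES = [0.25, 0.5, 1.5, 2.5]
TEST_MASSES = [0.1, 0.15, 2.75, 3.0]
MAX_STEPS = 500  # episode horizon quoted in the text


def sample_mass(masses, rng):
    """Pick one mass value uniformly (uniform sampling is an assumption)."""
    return rng.choice(masses)


def toy_next_velocity(velocity, force, mass, dt=0.01):
    """Toy 1-D cart update; NOT the real CartPoleSwingUp dynamics."""
    return velocity + dt * force / mass


def next_states_for_same_input(masses, velocity=0.0, force=1.0):
    """Apply the same action to the same state under each mass value."""
    return {m: toy_next_velocity(velocity, force, m) for m in masses}


if __name__ == "__main__":
    rng = random.Random(0)
    print("sampled training mass:", sample_mass(TRAIN_MASSES, rng))
    for mass, v in next_states_for_same_input(TRAIN_MASSES).items():
        print(f"mass={mass}: next velocity={v:.4f}")
```

In this toy model each mass value gives a different next velocity for the same state and action, which is the kind of multi-modality the visualization is meant to show. Every test mass also lies outside the range spanned by the training masses, so the evaluation probes extrapolation rather than interpolation.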
May-21-2025, 19:41:51 GMT