Appendices

Feb-8-2026, 03:34:38 GMT–Neural Information Processing Systems

In detail, we choose UMAP [15] as the projection algorithm and train the projecting function in Hopper using 64000 transitions sampled by the expert agent. To evaluate a policy, we sample the same number of transitions and then project them onto a 2-dimensional space with the trained projecting function. For empirical estimation, we subsequently discretize the projected 2-dimensional state space into small grid regions and estimate the distribution via Kernel Density Estimation (KDE) [19] with a Gaussian kernel. These two hyperparameters affect the experimental results more significantly. Moreover, as mentioned in Section 6.3, they can be tuned based on the distribution of the dataset.
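The estimation step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function `grid_kde`, its parameters `bins` and `bandwidth`, the state dimension, and the synthetic data are all hypothetical, and a fixed random linear map stands in for the trained UMAP projecting function so that the sketch needs only NumPy.

```python
import numpy as np

def grid_kde(points, bounds, bins=20, bandwidth=0.5):
    """Gaussian KDE evaluated at grid-cell centers, normalized to sum to 1.

    points: (n, 2) array of projected states.
    bounds: ((x_min, x_max), (y_min, y_max)) region to discretize.
    bins, bandwidth: hypothetical names for grid resolution and kernel width.
    """
    points = np.asarray(points, dtype=float)
    (x_min, x_max), (y_min, y_max) = bounds
    # Centers of the grid cells along each axis.
    xs = x_min + (np.arange(bins) + 0.5) * (x_max - x_min) / bins
    ys = y_min + (np.arange(bins) + 0.5) * (y_max - y_min) / bins
    centers = np.stack(np.meshgrid(xs, ys, indexing="ij"), axis=-1).reshape(-1, 2)
    # Squared distance between every cell center and every sample.
    sq_dist = ((centers[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    # Sum of (unnormalized) Gaussian kernels at each cell center.
    density = np.exp(-sq_dist / (2.0 * bandwidth ** 2)).sum(axis=1)
    return (density / density.sum()).reshape(bins, bins)

rng = np.random.default_rng(0)
state_dim = 11  # hypothetical state dimension
# Synthetic placeholders for transitions from the expert and the evaluated policy.
expert_states = rng.normal(size=(500, state_dim))
policy_states = rng.normal(loc=0.3, size=(500, state_dim))

# Stand-in for the trained projecting function (the text uses UMAP here).
W = rng.normal(size=(state_dim, 2))

def project(states):
    return states @ W

expert_2d, policy_2d = project(expert_states), project(policy_states)

# Use one shared grid so the two estimated distributions are comparable.
both = np.vstack([expert_2d, policy_2d])
bounds = tuple((both[:, k].min(), both[:, k].max()) for k in range(2))

p_expert = grid_kde(expert_2d, bounds)
p_policy = grid_kde(policy_2d, bounds)
```

The sketch materializes the full distance array between cell centers and samples, which is fine for a few hundred points; for sample counts like the 64000 transitions mentioned in the text, the samples would likely need to be processed in chunks.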



